Description
I discovered this while trying to make YAML deserialization work with Sorbet enum classes, but it can happen with any custom domain type. I registered a custom type with add_domain_type and defined a corresponding encode_with on T::Enum. If the same enum value is serialized more than once, only the first occurrence is deserialized as the enum; every later occurrence comes back as a plain Hash of the properties I serialized. Here is a simple example that reproduces the behavior:
require "yaml"
require "json"

::YAML.add_domain_type('test/custom_yaml', 'my_test_object') do |type, value|
  MyTestClass.new(prop1: value['prop1'], prop2: value['prop2'])
end

class MyTestClass
  attr_reader :prop1, :prop2

  def initialize(prop1:, prop2:)
    @prop1 = prop1
    @prop2 = prop2
  end

  def encode_with(coder)
    coder.tag = '!test/custom_yaml:my_test_object'
    coder.map = {
      'prop1' => @prop1,
      'prop2' => @prop2
    }
  end
end

test_obj = MyTestClass.new(prop1: 13, prop2: 1989)
generated = ::Psych.dump(
  {
    'object1' => test_obj,
    'object2' => test_obj,
    'object3' => test_obj
  }
)
puts "GENERATED YAML:"
puts generated
puts "END"
puts "PARSED YAML:"
puts ::Psych.safe_load(generated, permitted_classes: [MyTestClass], aliases: true)
puts "END"

Gives me the output:
GENERATED YAML:
---
object1: &1 !test/custom_yaml:my_test_object
  prop1: 13
  prop2: 1989
object2: *1
object3: *1
END
PARSED YAML:
{"object1" => #<MyTestClass:0x0000000120b4cfa0 @prop1=13, @prop2=1989>, "object2" => {"prop1" => 13, "prop2" => 1989}, "object3" => {"prop1" => 13, "prop2" => 1989}}
END
Here the values of object2 and object3 in the resulting hash are plain hashes of the serialized properties rather than instances of MyTestClass.
I believe the bug is that when an object is deserialized via a domain type, the converted result isn't registered, so subsequent alias references resolve to the unconverted Hash instead of the parsed object. If I add a call to register inside Psych::Visitors::ToRuby#accept, I get the same object for all 3 references in my example:
require "yaml"
require "json"

module Psych
  module Visitors
    class ToRuby
      def accept(target)
        result = super
        unless @domain_types.empty? || !target.tag
          key = target.tag.sub(/^[!\/]*/, '').sub(/(,\d+)\//, '\1:')
          key = "tag:#{key}" unless key.match?(/^(?:tag:|x-private)/)
          if @domain_types.key? key
            value, block = @domain_types[key]
            result = block.call value, result
            # Added line: register the converted object so later aliases resolve to it.
            register(target, result)
          end
        end
        result = deduplicate(result).freeze if @freeze
        result
      end
    end
  end
end

::YAML.add_domain_type('test/custom_yaml', 'my_test_object') do |type, value|
  MyTestClass.new(prop1: value['prop1'], prop2: value['prop2'])
end

class MyTestClass
  attr_reader :prop1, :prop2

  def initialize(prop1:, prop2:)
    @prop1 = prop1
    @prop2 = prop2
  end

  def encode_with(coder)
    coder.tag = '!test/custom_yaml:my_test_object'
    coder.map = {
      'prop1' => @prop1,
      'prop2' => @prop2
    }
  end
end

test_obj = MyTestClass.new(prop1: 13, prop2: 1989)
generated = ::Psych.dump(
  {
    'object1' => test_obj,
    'object2' => test_obj,
    'object3' => test_obj
  }
)
puts "GENERATED YAML:"
puts generated
puts "END"
puts "PARSED YAML:"
puts ::Psych.safe_load(generated, permitted_classes: [MyTestClass], aliases: true)
puts "END"

Gives me the output:
GENERATED YAML:
---
object1: &1 !test/custom_yaml:my_test_object
  prop1: 13
  prop2: 1989
object2: *1
object3: *1
END
PARSED YAML:
{"object1" => #<MyTestClass:0x00000001015dc918 @prop1=13, @prop2=1989>, "object2" => #<MyTestClass:0x00000001015dc918 @prop1=13, @prop2=1989>, "object3" => #<MyTestClass:0x00000001015dc918 @prop1=13, @prop2=1989>}
END
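To make the suspected mechanism easier to see in isolation, here is a minimal toy model of anchor and alias resolution. It is not Psych's code: ToyLoader, NODES, Point, and load_toy are hypothetical names invented for illustration, and the model only assumes what is described above, namely that the anchor table is filled with the raw value before the domain-type block runs.

```ruby
# Toy model of anchor/alias resolution. This is NOT Psych's implementation;
# every name below is hypothetical and exists only to illustrate the idea.
class ToyLoader
  def initialize(register_converted:, &converter)
    @register_converted = register_converted
    @converter = converter
    @anchors = {} # anchor name => object handed to later aliases
  end

  # A node is either { anchor:, value: } or { alias: }.
  def accept(node)
    return @anchors.fetch(node[:alias]) if node.key?(:alias)

    raw = node[:value]
    # The raw value is registered before the conversion happens.
    @anchors[node[:anchor]] = raw if node[:anchor]
    result = @converter.call(raw)
    # Proposed fix: overwrite the anchor entry with the converted object.
    @anchors[node[:anchor]] = result if @register_converted && node[:anchor]
    result
  end
end

Point = Struct.new(:x)
NODES = [{ anchor: "1", value: { "x" => 13 } }, { alias: "1" }].freeze

# Runs both nodes through a fresh loader and returns the resolved objects.
def load_toy(register_converted:)
  loader = ToyLoader.new(register_converted: register_converted) { |h| Point.new(h["x"]) }
  NODES.map { |node| loader.accept(node) }
end

p load_toy(register_converted: false).map(&:class) # => [Point, Hash]
p load_toy(register_converted: true).map(&:class)  # => [Point, Point]
```

Without the extra registration the alias resolves to the unconverted Hash, which matches the shape of the output in the first example. With it, the alias resolves to the very same converted object.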
Machine details
Apple M3 MacBook Air
Reproduced on: "ruby 3.4.9 (2026-03-11 revision 76cca827ab) +PRISM [arm64-darwin25]"