How to deserialize "NaN" as `nan` with serde_json?

2024-02-28

My data type looks like this:

#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Matrix {
    #[serde(rename = "numColumns")]
    pub num_cols: usize,
    #[serde(rename = "numRows")]
    pub num_rows: usize,
    pub data: Vec<f64>,
}

My JSON body looks like this:

{
    "numRows": 2,
    "numColumns": 1,
    "data": [1.0, "NaN"]
}

This is the serialization produced by Jackson (from a Java server we use), and it is valid JSON. Unfortunately, if we call serde_json::from_str(&blob) we get an error:

Error("invalid type: string "NaN", expected f64", [snip]

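I understand that floating-point numbers have subtleties and that people hold very strong opinions about how things should work. I respect that. Rust in particular likes to be opinionated, and I like that.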

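However, at the end of the day I am going to receive these JSON blobs, and I need the string "NaN" to deserialize to some f64 value for which is_nan() is true, and which serializes back to the string "NaN", because the rest of the ecosystem uses Jackson and this is fine there.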
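To make the requirement concrete, here is a minimal std-only sketch of the mapping being asked for. The function names float_from_token and special_token are hypothetical and are not part of serde or serde_json. Note that NaN never compares equal to itself, which is why the requirement is phrased in terms of is_nan() rather than ==.

```rust
/// Hypothetical helper: map a JSON string token to a float.
/// Only the exact token "NaN" is accepted; any other string is rejected.
pub fn float_from_token(token: &str) -> Option<f64> {
    if token == "NaN" {
        Some(f64::NAN)
    } else {
        None
    }
}

/// Hypothetical helper: the string token a serializer should emit for `x`,
/// or None when `x` should be written as an ordinary JSON number.
pub fn special_token(x: f64) -> Option<&'static str> {
    // `x == f64::NAN` is always false, so is_nan() is the only reliable check.
    if x.is_nan() {
        Some("NaN")
    } else {
        None
    }
}
```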

Can this be achieved in a reasonable way?

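Edit: the suggested linked questions discuss overriding the derived deserializer, but they do not explain how to deserialize floats specifically.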


It actually seems that using a custom deserializer inside a Vec (or Map, etc.) is an open issue on serde, and has been for over a year (as of the time of writing): https://github.com/serde-rs/serde/issues/723

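I believe the solution is to write a custom deserializer for f64 (which is fine), as well as for everything that uses f64 as a sub-thing (e.g. Vec<f64>, HashMap<K, f64>, etc.). Unfortunately, these things do not seem to be composable, since implementations of these methods look like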

fn deserialize<'de, D>(deserializer: D) -> Result<Vec<f64>, D::Error>
where D: Deserializer<'de> { /* snip */ }

and once you have a Deserializer, you can only interact with it through visitors.

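Long story short, I eventually got it working, but it seems like a lot more code than should be necessary. I am posting it here in the hope that either (a) someone knows how to clean this up, or (b) this really is how it should be done, and this answer will be useful to someone. I spent a whole day fervently reading docs and making trial-and-error guesses, so maybe this will be useful to someone else. The functions (de)serialize_float(s) should be used with the appropriate #[serde( (de)serialize_with="etc." )] attribute above the field name.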

use serde::de::{self, SeqAccess, Visitor};
use serde::ser::SerializeSeq;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::fmt;

type Float = f64;

const NAN: Float = std::f64::NAN;

struct NiceFloat(Float);

impl Serialize for NiceFloat {
    #[inline]
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        serialize_float(&self.0, serializer)
    }
}

pub fn serialize_float<S>(x: &Float, serializer: S) -> Result<S::Ok, S::Error>
where
    S: Serializer,
{
    if x.is_nan() {
        serializer.serialize_str("NaN")
    } else {
        serializer.serialize_f64(*x)
    }
}

pub fn serialize_floats<S>(floats: &[Float], serializer: S) -> Result<S::Ok, S::Error>
where
    S: Serializer,
{
    let mut seq = serializer.serialize_seq(Some(floats.len()))?;

    for f in floats {
        seq.serialize_element(&NiceFloat(*f))?;
    }

    seq.end()
}

struct FloatDeserializeVisitor;

impl<'de> Visitor<'de> for FloatDeserializeVisitor {
    type Value = Float;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("a float or the string \"NaN\"")
    }

    fn visit_i32<E>(self, v: i32) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_i64<E>(self, v: i64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_u32<E>(self, v: u32) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_f32<E>(self, v: f32) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_f64<E>(self, v: f64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v as Float)
    }

    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        if v == "NaN" {
            Ok(NAN)
        } else {
            Err(E::invalid_value(de::Unexpected::Str(v), &self))
        }
    }
}

struct NiceFloatDeserializeVisitor;

impl<'de> Visitor<'de> for NiceFloatDeserializeVisitor {
    type Value = NiceFloat;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("a float or the string \"NaN\"")
    }

    fn visit_f32<E>(self, v: f32) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(NiceFloat(v as Float))
    }

    fn visit_f64<E>(self, v: f64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(NiceFloat(v as Float))
    }

    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        if v == "NaN" {
            Ok(NiceFloat(NAN))
        } else {
            Err(E::invalid_value(de::Unexpected::Str(v), &self))
        }
    }
}

pub fn deserialize_float<'de, D>(deserializer: D) -> Result<Float, D::Error>
where
    D: Deserializer<'de>,
{
    deserializer.deserialize_any(FloatDeserializeVisitor)
}

impl<'de> Deserialize<'de> for NiceFloat {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        let raw = deserialize_float(deserializer)?;
        Ok(NiceFloat(raw))
    }
}

pub struct VecDeserializeVisitor<T>(std::marker::PhantomData<T>);

impl<'de, T> Visitor<'de> for VecDeserializeVisitor<T>
where
    T: Deserialize<'de> + Sized,
{
    type Value = Vec<T>;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("A sequence of floats or \"NaN\" string values")
    }

    fn visit_seq<S>(self, mut seq: S) -> Result<Self::Value, S::Error>
    where
        S: SeqAccess<'de>,
    {
        let mut out = Vec::with_capacity(seq.size_hint().unwrap_or(0));

        while let Some(value) = seq.next_element()? {
            out.push(value);
        }

        Ok(out)
    }
}

pub fn deserialize_floats<'de, D>(deserializer: D) -> Result<Vec<Float>, D::Error>
where
    D: Deserializer<'de>,
{
    let visitor: VecDeserializeVisitor<NiceFloat> = VecDeserializeVisitor(std::marker::PhantomData);

    let seq: Vec<NiceFloat> = deserializer.deserialize_seq(visitor)?;

    let raw: Vec<Float> = seq.into_iter().map(|nf| nf.0).collect::<Vec<Float>>();

    Ok(raw)
}