I am designing a language analyzer for French text. I have a dictionary in XML format that looks like this:
<?xml version="1.0" encoding="utf-8"?>
<Dictionary>
<!--This is the base structure for every entry in the dictionary. Values on attributes are given
as explanations for the attributes. Though this is the structure of the finished product for each word, definition, context and context examples will be omitted as they don't have a real effect on the application at this moment. Defini-->
<Word word="The word in the dictionary (any word that would be defined)." aspirate="Whether or not the word starts with an aspirate h. Some adjectives that come before words that start with a non-aspirate h have an extra form (AdjectiveForms -> na [non-aspirate]).">
<GrammaticalForm form="The grammatical form of the word is the grammatical context in which it is used. Forms may consist of a word in noun, adjective, adverb, exclamatory or other form. Each form (generally) has its own definition, as the meaning of the word changes in the way it is used.">
<Definition definition=""></Definition>
</GrammaticalForm>
<ConjugationTables>
<NounForms ms="The masculine singular form of the noun." fs="The feminine singular form of the noun." mpl="The masculine plural form of the noun." fpl="The feminine plural form of the noun." gender="The gender of the noun. Determines"></NounForms>
<AdjectiveForms ms="The masculine singular form of the adjective." fs="The feminine singular form of the adjective." mpl="The masculine plural form of the adjective." fpl="The feminine plural form of the adjective." na="The non-aspirate form of the adjective, in the case where the adjective is followed by a non-aspirate word." location="Where the adjective is placed around the noun (before, after, or both)."></AdjectiveForms>
<VerbForms group="What group the verb belongs to (1st, 2nd, 3rd or exception)." auxillary="The auxiliary verb taken by the verb." prepositions="A CSV list of valid prepositions this verb uses; for grammatical analysis." transitive="Whether or not the verb is transitive." pronominal="The pronominal infinitive form of the verb, if the verb allows pronominal construction.">
<Indicative>
<Present fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Present>
<SimplePast fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></SimplePast>
<PresentPerfect fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></PresentPerfect>
<PastPerfect fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></PastPerfect>
<Imperfect fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Imperfect>
<Pluperfect fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Pluperfect>
<Future fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Future>
<PastFuture fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></PastFuture>
</Indicative>
<Subjunctive>
<Present fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Present>
<Past fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Past>
<Imperfect fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Imperfect>
<Pluperfect fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Pluperfect>
</Subjunctive>
<Conditional>
<Present fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></Present>
<FirstPast fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></FirstPast>
<SecondPast fps="(Je) first person singular." sps="(Tu) second person singular." tps="(Il) third person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural." tpp="(Ils) third person plural."></SecondPast>
</Conditional>
<Imperative>
<Present sps="(Tu) second person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural."></Present>
<Past sps="(Tu) second person singular." fpp="(Nous) first person plural." spp="(Vous) second person plural."></Past>
</Imperative>
<Infinitive present="The present infinitive form of the verb." past="The past infinitive form of the verb."></Infinitive>
<Participle present="The present participle of the verb." past="The past participle of the verb."></Participle>
</VerbForms>
</ConjugationTables>
</Word>
</Dictionary>
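Sorry that it is so long, but it is necessary to show exactly how the data is modeled (a tree-node structure).
Currently I am using structs to model the conjugation tables, nested structs to be more specific. Here is the class I created to model a single entry in the XML file: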
using System.Collections.Generic;

class Word
{
    public string word { get; set; }
    public bool aspirate { get; set; }
    public List<GrammaticalForms> forms { get; set; }

    // Must be public because it is exposed through the public 'forms' property.
    public struct GrammaticalForms
    {
        public string form { get; set; }
        public string definition { get; set; }
    }

    struct NounForms
    {
        public string gender { get; set; }
        public string masculinSingular { get; set; }
        public string femininSingular { get; set; }
        public string masculinPlural { get; set; }
        public string femininPlural { get; set; }
    }

    struct AdjectiveForms
    {
        public string masculinSingular { get; set; }
        public string femininSingular { get; set; }
        public string masculinPlural { get; set; }
        public string femininPlural { get; set; }
        public string nonAspirate { get; set; }
        public string location { get; set; }
    }

    struct VerbForms
    {
        public string group { get; set; }
        public string auxillary { get; set; }
        public string[] prepositions { get; set; }
        public bool transitive { get; set; }
        public string pronominalForm { get; set; }

        // Indicative mood: one struct per tense, six persons each.
        struct IndicativePresent
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct IndicativeSimplePast
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct IndicativePresentPerfect
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct IndicativePastPerfect
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct IndicativeImperfect
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct IndicativePluperfect
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct IndicativeFuture
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct IndicativePastFuture
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        // Subjunctive mood.
        struct SubjunctivePresent
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct SubjunctivePast
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct SubjunctiveImperfect
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct SubjunctivePluperfect
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        // Conditional mood.
        struct ConditionalPresent
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct ConditionalFirstPast
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        struct ConditionalSecondPast
        {
            public string firstPersonSingular { get; set; }
            public string secondPersonSingular { get; set; }
            public string thirdPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
            public string thirdPersonPlural { get; set; }
        }

        // Imperative mood: only three persons exist.
        struct ImperativePresent
        {
            public string secondPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
        }

        struct ImperativePast
        {
            public string secondPersonSingular { get; set; }
            public string firstPersonPlural { get; set; }
            public string secondPersonPlural { get; set; }
        }

        struct Infinitive
        {
            public string present { get; set; }
            public string past { get; set; }
        }

        struct Participle
        {
            public string present { get; set; }
            public string past { get; set; }
        }
    }
}
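I am new to C# and not very familiar with data structures. Based on my limited knowledge of C++, I know that structs are useful when you are modeling small, highly related pieces of data, which is why I am currently using them this way.
All of these structs could realistically be made into a single conjugation-table class, since to a large extent they have the same structure. I am unsure whether to make them into a class or to use a different data structure that is better suited to the problem. To give some more information about the problem specification, I will say the following: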
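1. Once these values are loaded from the XML file, they will not be changed.
2. These values will be read/fetched very often.
3. The table-like structure must be maintained. That is, IndicativePresent must be nested within VerbForms; the same goes for all the other structs that are members of the VerbForms struct. These are conjugation tables, after all!
4. Perhaps most important: I need the data to be organized so that, if a Word in the XML file has no GrammaticalForm of verb, then no VerbForms structure is actually created for that entry. This is for efficiency: why instantiate VerbForms if the word is not actually a verb? Avoiding the unnecessary creation of these "forms" tables (currently represented as struct XXXXXForms) is absolutely imperative.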
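To make the single conjugation-table idea and the first three points concrete, here is a minimal sketch of what one shared, read-only table type might look like. The names PersonTable and IndicativeTables are hypothetical and not part of my current code, and only two tenses are shown to keep it short.

```csharp
using System;

// Hypothetical shared type: one immutable six-person table that could stand in
// for the separate IndicativePresent, IndicativeFuture, ... structs.
public sealed class PersonTable
{
    public string FirstPersonSingular { get; }
    public string SecondPersonSingular { get; }
    public string ThirdPersonSingular { get; }
    public string FirstPersonPlural { get; }
    public string SecondPersonPlural { get; }
    public string ThirdPersonPlural { get; }

    // Values are set once at load time; get-only properties prevent later changes.
    public PersonTable(string fps, string sps, string tps,
                       string fpp, string spp, string tpp)
    {
        FirstPersonSingular = fps;
        SecondPersonSingular = sps;
        ThirdPersonSingular = tps;
        FirstPersonPlural = fpp;
        SecondPersonPlural = spp;
        ThirdPersonPlural = tpp;
    }
}

// Hypothetical grouping that keeps the nesting: mood -> tense -> person.
public sealed class IndicativeTables
{
    public PersonTable Present { get; }
    public PersonTable Future { get; }

    public IndicativeTables(PersonTable present, PersonTable future)
    {
        Present = present;
        Future = future;
    }
}
```

With this shape a lookup would read something like indicative.Present.FirstPersonPlural, so the table-like nesting is preserved while the six-person layout is written only once.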
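Given (primarily) point #4 above, what type of data structure is best suited for modeling conjugation tables (not database tables)? Do I need to change the format of my data in order to comply with #4? If I instantiate a new Word, will the structs, in their current state, also be instantiated and take up a lot of space? Here is some math... after Googling around and eventually finding this question (https://stackoverflow.com/questions/5690922/sizeof-empty-string-in-c-sharp)...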
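Across all of the conjugation tables (noun, adjective, verb), there are a total of (coincidence?) 100 strings that are allocated and empty. So 100 x 18 bytes = 1800 bytes for each Word, at minimum, if these data structures are created and remain empty (there will always be at least some overhead for the values that actually get filled in). So assuming (just at random, it could be more or less) that 50,000 Words need to be in memory, that is 90 million bytes, or approximately 85.8307 megabytes.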
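Written out step by step, using only my own assumed figures from above (100 empty strings, 18 bytes each, 50,000 words) and taking one megabyte as 2^20 bytes:

```latex
\begin{align*}
100 \times 18\ \text{bytes} &= 1800\ \text{bytes per Word} \\
50{,}000 \times 1800\ \text{bytes} &= 90{,}000{,}000\ \text{bytes} \\
\frac{90{,}000{,}000\ \text{bytes}}{2^{20}\ \text{bytes per megabyte}} &\approx 85.8307\ \text{megabytes}
\end{align*}
```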
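That is a lot of overhead just for having empty tables. So is there a way to put this data together that allows me to instantiate only certain tables (noun, adjective, verb), depending on what GrammaticalForms the Word actually has (in the XML file)?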
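I want these tables to be members of the Word class, but to instantiate only the tables I need. I cannot think of a way around it, and now that I have done the math on the structs, I know that they are not a good solution. My first thought is to make a class for each type (NounForms, AdjectiveForms, and VerbForms) and to instantiate the class only if the form appears in the XML file. I am not sure whether that is correct...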
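As a rough sketch of that first thought, assuming the tables become classes and the file is read with LINQ to XML (System.Xml.Linq): a class-typed member that is never assigned simply stays null, so nothing is allocated for a table the entry does not have. The property names Text, Noun and Verb are hypothetical, and only a few attributes are read to keep the example short.

```csharp
using System;
using System.Xml.Linq;

public sealed class NounForms
{
    public string Gender { get; }
    public string MasculinSingular { get; }

    public NounForms(XElement e)
    {
        // Casting a missing (null) XAttribute to string yields null.
        Gender = (string)e.Attribute("gender");
        MasculinSingular = (string)e.Attribute("ms");
    }
}

public sealed class VerbForms
{
    public string Group { get; }

    public VerbForms(XElement e)
    {
        Group = (string)e.Attribute("group");
    }
}

public sealed class Word
{
    public string Text { get; }
    public NounForms Noun { get; }  // null when the entry has no <NounForms>
    public VerbForms Verb { get; }  // null when the entry has no <VerbForms>

    public Word(XElement entry)
    {
        Text = (string)entry.Attribute("word");
        XElement tables = entry.Element("ConjugationTables");
        XElement noun = tables?.Element("NounForms");
        XElement verb = tables?.Element("VerbForms");
        // Only build a table object when its element is present in the XML.
        Noun = noun == null ? null : new NounForms(noun);
        Verb = verb == null ? null : new VerbForms(verb);
    }
}
```

Code that reads a Word would then have to check, for example, word.Verb != null before looking up a conjugation, which is the price of not creating the table for non-verbs.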
Any suggestions?