11/1/2022

Jedict breen

This is already well known to many people, but I wish there had been a thread like this a year ago, when I started using EDICT2 hardcore. EDICT files are generated from JMdict, so both of them contain almost precisely the same information, but there are a few key differences.

First, EDICT2 is not maintained very well. The current version of edict2.gz that's held on the official site (as of today, [date missing]) was generated on [date missing]. I'm sure it was perfectly synchronized with JMdict at that time, but the most recent JMdict was created on [date missing].

This more or less directly causes the second problem: the quality of entries in EDICT2 is much lower than in JMdict. I've been using an Anki deck generated from EDICT2 for just over a year now, and some of the definitions are fairly poor, so I just got in the habit of looking up words I didn't understand on ALC (something I highly recommend no matter what dictionary you use). But eventually I ran into this entry, which finally made me give up on EDICT:

[entry missing]

I mean, what the hell is that even supposed to mean? "Headwaters" is the term for the source of a stream or river. Is that what "a head (of water)" means? It has nothing to do with "difference", though, so that's probably not it. For comparison, this is the definition currently given in JMdict:

(1) (n) difference in elevation (between two points in a body of water) / head / drop (e.g. of a waterfall) / fall distance (2) difference / gap

If that alone doesn't convince you to use JMdict, I'm not sure what else will, but there's one more important distinction I'd like to make: EDICT entries contain a very useful tag, "P", which I assume stands for "priority" or "popular", because those entries are considered to be the most common words in the language. This is GREAT for using the dictionary as an educational tool, because you can filter the 178,272 entries down to just the 22,021 that carry it. However, this tag is really a simplification of data which is much more complete in JMdict. This is from the JMdict/EDICT documentation:

A. news1/2: appears in the "wordfreq" file compiled by Alexandre Girardi from the Mainichi Shimbun. (See the Monash ftp archive for a copy.) Words in the first 12,000 in that file are marked "news1" and words in the second 12,000 are marked "news2".
B. ichi1/2: appears in the "Ichimango goi bunruishuu", Senmon Kyouiku Publishing, Tokyo, 1998. (The entries marked "ichi2" were demoted from ichi1 because they were observed to have low frequencies in [text missing].)
C. spec1 and spec2: a small number of words use this marker when they are detected as being common, but are not included in other lists.
D. gai1/2: common loanwords, also based on the wordfreq file.
E. nfxx: this is an indicator of frequency-of-use ranking in the wordfreq file. "xx" is the number of the set of 500 words in which the entry can be found, with "01" assigned to the first 500, "02" to the second, and so on.

Entries with news1, ichi1, spec1/2 and gai1 values are marked with a "(P)" in the EDICT and EDICT2 files.

As you can see, "P" is really just a catch-all tag that means an entry has ANY of those five priority tags. If you use EDICT, you have no way to detect which priority tag was used for that entry. With JMdict, you have much finer control. There are 23,313 JMdict entries which qualify as "P" (about the same number as EDICT), but when you use JMdict, you can select only the "news1" entries (11,722), or "ichi1" entries (9,958), or "spec1" and "spec2" entries (3,310), or "nf01" through "nf09" (4,435), or only entries that are tagged "news1" AND "ichi1" (5,356), or any other combination of priority tags that you can think of.
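For anyone who wants to try this kind of priority-tag filtering on JMdict, here is a rough Python sketch. It assumes the XML edition of JMdict, where (as far as I know) priority tags are stored in `ke_pri` and `re_pri` elements under each entry's kanji and reading elements. The sample data is a tiny hand-written stand-in, not real dictionary content, and the helper names (`priority_tags`, `select`, `is_p`, `nf_band`) are made up for illustration. It also lumps all tags of an entry together, which is a simplification, since the real file attaches them to individual kanji and reading forms.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the general shape of JMdict XML (assumed layout,
# placeholder words, NOT real dictionary data).
SAMPLE = """<JMdict>
<entry><ent_seq>1</ent_seq>
  <k_ele><keb>word1</keb><ke_pri>news1</ke_pri><ke_pri>ichi1</ke_pri><ke_pri>nf03</ke_pri></k_ele>
  <r_ele><reb>reading1</reb><re_pri>news1</re_pri></r_ele>
</entry>
<entry><ent_seq>2</ent_seq>
  <r_ele><reb>reading2</reb><re_pri>spec2</re_pri></r_ele>
</entry>
<entry><ent_seq>3</ent_seq>
  <k_ele><keb>word3</keb><ke_pri>news2</ke_pri><ke_pri>nf30</ke_pri></k_ele>
  <r_ele><reb>reading3</reb></r_ele>
</entry>
<entry><ent_seq>4</ent_seq>
  <r_ele><reb>reading4</reb></r_ele>
</entry>
</JMdict>"""

# The tags that earn a "(P)" in EDICT, per the documentation quoted above.
P_TAGS = {"news1", "ichi1", "spec1", "spec2", "gai1"}


def priority_tags(entry):
    """Collect every ke_pri/re_pri value found anywhere inside one entry."""
    return {el.text for el in entry.iter() if el.tag in ("ke_pri", "re_pri")}


def select(root, required):
    """Return ent_seq of entries that carry ALL of the required tags."""
    required = set(required)
    return [
        entry.findtext("ent_seq")
        for entry in root.iter("entry")
        if required <= priority_tags(entry)
    ]


def is_p(tags):
    """True if a tag set would be flattened to "(P)" in EDICT."""
    return bool(set(tags) & P_TAGS)


def nf_band(rank):
    """Map a 1-based wordfreq rank to its nfxx tag (500 words per band)."""
    return f"nf{(rank - 1) // 500 + 1:02d}"


root = ET.fromstring(SAMPLE)
both = select(root, {"news1", "ichi1"})  # entries tagged news1 AND ichi1
p_entries = [
    e.findtext("ent_seq") for e in root.iter("entry") if is_p(priority_tags(e))
]
```

The point of the sketch is the difference between `is_p`, which throws away which tag matched (all EDICT gives you), and `select`, which lets you ask for any combination. For the real file you would probably want `ET.iterparse` rather than loading everything at once, since JMdict is large.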