<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=euc-jp"> <title>WWWJDIC - INFORMATION</title> </head> <body bgcolor="ivory"> <a name="top_tag"> <!-- <CENTER><TABLE cellpadding=15 cellspacing=2 border=0> --> <!-- <TR> --> <!-- <TD align="left"><a href="http://nihongo.monash.edu/tiny.png" alt="Jim's picture"></a></td> --> <!-- <TD><center><h1>Jim Breen's WWWJDIC<br>User Guide</h1> --> <!-- </center></td> --> <!-- <td><img src="http://assets.monash.edu/logos/logo-small.gif" alt="Monash Logo" align=right> --> <!-- </TD></TR></TABLE></CENTER> --> <center><h1>WWWJDIC Japanese Dictionary Server<br>User Guide</h1> </center> <TABLE cellpadding=15 cellspacing=2 border=0> <TR> <TD align="left" valign = "top"> Contents: </td> <TD align="left"> <font size="-1"> <a href="#intro_tag">Introduction</a> <a href="#opins_tag">Operating Instructions</a> <a href="#trans_tag">Translating Text</a> <a href="#dicfil_tag">Dictionary Files</a> <a href="#mult_tag">Multi-Radical</a> <a href="#links_tag">Links</a> <a href="#examp_tag">Examples</a> <a href="#verb_tag">Verb Conjugations</a> <a href="#newsub_tag">Submitting Amendments & New Entries</a> <a href="#sod_tag">Stroke Order Diagrams</a> <a href="#multling_tag">Japanese Interface</a> <a href="#code_tag">Codes</a> <a href="#copyr_tag">Copyright</a> <a href="#faq_tag">FAQ</a> <a href="#whatsnew">What's New</a> <a href="#history">History</a> <a href="#planimp_tag">Planned Improvements</a> <a href="#bugs_tag">Known Bugs</a> <a href="#browser_tag">Browsing in Japanese</a> <a href="#techbits_tag">Technical Bits</a> <a href="#repbugs_tag">Bug Reports</a> <a href="#mirror_tag">Mirrors</a> <a href="#backdoor_tag">Backdoor Entry/API</a> <a href="#feed_tag">Feedback</a> <a href="#don_tag">Donations</a> <a href="#disc_tag">Disclaimer</a> <a href="#ack_tag">Acknowledgements</a> </font> </td></tr></table> <p> <font size="-1"> Last updated: 11 Oct 2022. 
</font> <h2><a name="intro_tag"> INTRODUCTION</a></h2> <p>Welcome to <b>WWWJDIC</b>, the dictionary server operated by the Electronic Dictionary Research and Development Group (EDRDG) and associated with the <a href="http://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project"> JMdict/EDICT</a> and <a href="http://www.edrdg.org/wiki/index.php/KANJIDIC_Project">KANJIDIC</a> projects. <p>Please note that this server is intended for people who have studied some Japanese and who can read at least kana. There is no display of romanized Japanese. <p>WWWJDIC operates at several <b>mirror sites</b> around the globe. All sites carry identical information. Check <a href= "http://www.edrdg.org/wwwjdic/wwwjdicmirrors.html">here</a> for the location of the nearest mirror site. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="opins_tag">OPERATING INSTRUCTIONS</a></h2> These are minimal, as the operation of WWWJDIC is intended to be as intuitive and self-explanatory as possible. There is an <a href="#faq_tag">FAQ</a> section at the back of this page. <h3>Romaji</h3> <p> Care is needed with the form of romaji used for <em>input</em>. WWWJDIC expects "wapuro romaji", i.e. it should be typed as though it was going into a Japanese-capable Input Method (IM or IME), e.g. with an editor or word-processor. For example: <ul> <li> long vowels in native Japanese words must be in the romaji equivalents of the kana form. Thus it is "toukyou" and "oosaka". <b>Please note</b> that you <b>must</b> have the correct Japanese vowel lengths. Many people email saying they cannot find words like "ronin", when they should have been trying "rounin". <li> long vowels in <em>gairaigo</em> must use a "-" to indicate the <em>chouon</em> character (ー). Thus it's "su-pa-", not "suupaa". <li> use an apostrophe (') to disambiguate things like hon'yaku and Shin'ichi. (Some IMEs use repeated n's for this.) 
<li> the small "tsu" (sokuon) is usually produced by repeating the consonant (e.g. socchoku). In the case of a sokuon before a "cho", a "t" can also be used (e.g. sotchoku). <li> for the voiced forms of "tsu" and "chi" use "dzu" and "dji". Thus you need to look up "tsudzuku", not "tsuzuku". </ul> Note that WWWJDIC can accept both Hepburn and kunrei/nihon shiki; both sin'iti and shin'ichi map to the same kana. Also, as in many IMEs, xa, xi, etc. can be used for the small kana vowels. <p>If you are entering KUN readings when looking up kanji, note that the fixed and inflecting portions are divided by a "." (in ASCII). Normally entering a "." in romaji will result in a JIS ".", so WWWJDIC lets you specify an ASCII "." by using a comma. Thus, use "a,u" or "ka,keru". Note this <b>only</b> applies to the kanji database. <p>For people who don't like having to click the "romanized Japanese" box on the dictionary search page, you enter romaji by prefixing the romaji with an "@" character (for hiragana), e.g. "@koujou", or a "#" character (for katakana), e.g. "#va-jon". In fact this is the only way you can input the odd katakana such as the small "ke" character or the "vu" character.</p> <h3>Exact Match</h3> An option on the Word Search page is "Require exact word-match", for non-Japanese search keys. If you select this option, only a restricted number of entries will be displayed, as one of the senses in the dictionary entry must match the key <b>exactly</b>, however two exceptions are made: <ol type="i"> <li>any characters in parentheses before the keyword are ignored; <li>the characters "to " preceding the keyword are ignored (thus allowing matches on English verbs). </ol> <h3>Searching for Japanese Words</h3> In general Japanese (and English) words can only be searched for from the <b>beginning</b> of the word. The only exception is when the search key begins with a kanji. 
In that case the match can occur anywhere in a word; however, you may restrict it to occur at the beginning of the word. <h3>Searching for English Words</h3> You need to know that the dictionary files are based on Japanese head-words, and selecting entries using English keys can give misleading results. For example, looking for "book" in the full EDICT file will return potentially 350 entries. For searching the EDICT file, you <em>may</em> be able to get better results by setting the common word restriction via the checkbox on the initial menu. Using the "Exact Match" option may also improve the results. Checking the example sentences (if available) will help verify whether the word is suitable. At all times the user should exercise caution. <p> The server has a list of variant English words and spellings, and if one of these is entered, it will suggest possible alternatives. So if you put in "favourite", it will suggest also looking at "favorite", if you put in "faucet", it will suggest "tap", etc. The suggestions are clickable links, so you can easily check out the suggestion. (The word list comes from the <a href="http://sourceforge.net/projects/wordlist/">VarCon</a> collection.) <p> Note that words of only one or two letters cannot be used as keys. This is to stop the dictionary index being filled with references to "if", "it", "of", "or", etc. A number of other common words such as "the" cannot be used as keys for the same reason. <h3>Searching for multiple words</h3> <p> A search can be made using two words as the search key, e.g. "break out". In this case you will find all entries in which both words appear. The words can be a mixture of Japanese and English. For example searching for "こう high" will find entries where the reading starts with こう and where the English meaning contains "high". <p> Short phrases, etc. can be searched by using an underscore character between words, e.g. "break_out". 
In this case only entries in which the words appear in succession will be displayed. <h3>Kanji Colours</h3> In the regular dictionary display, the kanji are displayed in different colours according to their classification. The common jouyou (常用) kanji are in black, the jinmeiyou (人名用) kanji are in purple, and all others are in green. This feature can be disabled using the Customization feature, in which case all kanji will be black. <h3>Taskbar Search Buttons</h3> <p>Some small JavaScript programs are available which enable text to be marked and then dropped straight into various lookup functions by clicking on a Taskbar button. Buttons are available for searching for Japanese or English words, and for using the Translate Words in Text function. See the <a href="http://nihongo.monash.edu/wwwbuttongen.html">button generator</a> page for details.</p> <h3>Multi-Radical Kanji Selection</h3> <p>The Multi-Radical Kanji Selection feature does <b>not</b> use the 214 classical radicals. Instead it uses a slightly different set which includes more basic shapes. Note that the identification of the kanji is based on the visual appearance of the elements, not on their classical radical. </p> <h3>Customizing</h3> <p> You have the opportunity to change many of the visual aspects of WWWJDIC's input and display. There is a "customization" page which lets you change the basic colours, lines/display, etc. It also lets you change from the default EUC input and output coding to either Shift-JIS or Unicode (UTF-8). For users with modern browsers, Unicode (UTF-8) may be worth using as it avoids the use of bit-mapped images.</p> <p> The customization can take place either by setting a <em>cookie</em> in your browser, or by setting some URL parameters. Note that the cookies only work for the server which set them. 
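<p>The romaji input conventions described under "Romaji" above can be illustrated with a short Python sketch. This is not the server's code: the syllable table <code>SYLLABLES</code> is a deliberately tiny, hypothetical sample, and the function names <code>to_hiragana</code>, <code>to_katakana</code> and <code>parse_key</code> are invented for this illustration. It only aims to show how the "-", apostrophe, doubled-consonant and "@"/"#" rules interact.</p>

```python
# Hypothetical, deliberately tiny syllable table; a real IME table is far larger.
SYLLABLES = {
    "i": "い", "u": "う", "o": "お",
    "ka": "か", "ku": "く", "kyo": "きょ",
    "sa": "さ", "shi": "し", "si": "し", "su": "す", "so": "そ",
    "chi": "ち", "ti": "ち", "tsu": "つ", "tu": "つ", "dzu": "づ",
    "to": "と", "cho": "ちょ", "ni": "に", "nya": "にゃ",
    "ho": "ほ", "pa": "ぱ", "ya": "や", "ro": "ろ", "n": "ん",
}
VOWELS = "aiueo"


def to_hiragana(romaji):
    """Convert wapuro-style romaji to hiragana using the tiny table above."""
    out = []
    rest = romaji.lower()
    while rest:
        if rest[0] == "-":                  # "-" stands for the chouon mark
            out.append("ー")
            rest = rest[1:]
        elif rest.startswith("n'"):         # apostrophe forces a syllabic n
            out.append("ん")
            rest = rest[2:]
        elif rest[:2] == "tc" or (
            rest[0] == rest[1:2] and rest[0] not in VOWELS + "n"
        ):                                  # sokuon: doubled consonant, or t before ch
            out.append("っ")
            rest = rest[1:]
        else:
            for size in (3, 2, 1):          # greedy longest match in the table
                kana = SYLLABLES.get(rest[:size])
                if kana:
                    out.append(kana)
                    rest = rest[size:]
                    break
            else:
                raise ValueError("no kana in sample table for: " + rest)
    return "".join(out)


def to_katakana(hiragana):
    """Shift hiragana code points (U+3041..U+3096) up by 0x60 to reach katakana."""
    return "".join(
        chr(ord(ch) + 0x60) if ord(ch) in range(0x3041, 0x3097) else ch
        for ch in hiragana
    )


def parse_key(key):
    """Apply the "@" (hiragana) and "#" (katakana) prefixes; other keys pass through."""
    if key.startswith("@"):
        return to_hiragana(key[1:])
    if key.startswith("#"):
        return to_katakana(to_hiragana(key[1:]))
    return key
```

<p>With this sketch, <code>to_hiragana("hon'yaku")</code> and <code>to_hiragana("honyaku")</code> give different kana, which is why the apostrophe matters, while <code>"sin'iti"</code> and <code>"shin'ichi"</code> give the same kana.</p>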
<br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="trans_tag">TRANSLATING TEXT</a></h2> One of the options of WWWJDIC is to translate the words in Japanese text. Please note, the function does <b>NOT</b> attempt to translate Japanese text into English; it simply sets out to identify the words in the text and to display the translations of those words. The user is expected to know enough Japanese grammar to make sense of the results. The input text is displayed in sections, with the words detected/translated in red, or in blue where an inflected verb or adjective is assumed. If a user requests that a word/phrase only be translated once (see below), the text is displayed in brown for subsequent occurrences. <p>You can use this option in two ways:</p> <ol type="A"> <li>cut-and-paste text from another application into the text box on the browser screen. (It usually seems to go automatically into the EUC required by WWWJDIC, but if you are having problems, try the option of forcing the server to convert it to EUC.) In some cases the cut-and-paste may break characters up, resulting in a load of mojibake. Sorry if this happens, but it's a browser problem and can't be fixed in the server.</li> <li>specify the URL of a WWW page, and the server will fetch that page and translate the words in it. Note that in doing so, it deletes everything between < and >, i.e. all HTML labels, etc. and as a default deletes all non-Japanese characters, so all you get is the raw Japanese. (You can override this and get it to leave the non-Japanese in if you wish.) Where non-Japanese has been deleted, a "|" is inserted. (In this option, you may wish to set a new timeout value if the fetch of the WWW page takes longer than the default 60 seconds allowed.) Please note that WWWJDIC makes no attempt to handle cookies. If you can't use this facility because the site you are viewing requires cookies enabled, you will have to use the cut-and-paste alternative. 
<p> Watch out for URLs which don't actually point at the text you are seeing. Examples of this include text in a Frame. You need to give WWWJDIC the actual address of the frame - you can usually find this out from the browser if you right-click on the Frame text. </li> </ol> <p> The default is for the original text to be displayed one line at a time, followed by a list of translated words. For the "cut-and-paste" text, there is a "hidden translation" option, in which the word translations are embedded in the text and become visible when the mouse pointer is held over the word (this option only works with browsers supporting HTML 4.) <p>The server detects words in the text as follows:</p> <ol type="a"> <li>gairaigo in katakana are detected and looked up;</li> <li>jukugo beginning with kanji are detected;</li> <li>where a kanji is followed by two or more hiragana, an attempt is made to match the kana against known verb/adjective inflections. If this succeeds, the equivalent dictionary form of the word is sought. If this is successful, the match is displayed, and the matched text displayed in blue;</li> <li>single kanji which have not been detected in the above will be matched against dictionary entries (if any). (This may be turned off by the user.)</li> <li> sequences of four or more hiragana are matched against a small file of words and phrases typically written in kana alone. Only exact matches are reported. (This function may be expanded, but the possibility of false matches is high.) </li> <li> a special case is made of an <em>o</em> or <em>go</em> hiragana, or the <em>GO</em> kanji preceding a kanji. In this case a check is made to see if the word is present in the dictionary files with and without the prefix. </li> </ol> Matches against complete dictionary entries are favoured over partial matches of longer entries, and if two equivalent matches are found, the longer is returned. 
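<p>The first three detection steps listed above can be sketched roughly in Python as follows. This is a simplified illustration, not the server's implementation: <code>DICTIONARY</code> and <code>INFLECTIONS</code> are tiny hypothetical stand-ins for the real dictionary and inflection files, the function names are invented, and the "two or more hiragana", single-kanji, kana-only and o/go prefix rules are left out.</p>

```python
import re

# Hypothetical miniature dictionary (headword to gloss); stands in for GLOSSDIC.
DICTIONARY = {
    "日本語": "Japanese (language)",
    "日本": "Japan",
    "辞書": "dictionary",
    "サーバー": "server",
    "食べる": "to eat",
}

# Hypothetical inflection table: (ending seen in text, dictionary-form ending).
INFLECTIONS = [("ました", "る"), ("て", "る")]

KATAKANA_RUN = re.compile(r"[\u30a1-\u30fc]+")
KANJI_CHUNK = re.compile(r"[\u4e00-\u9fff][\u4e00-\u9fff\u3041-\u3096]*")


def longest_match(chunk):
    """Find the longest prefix of chunk that is a headword or an inflected form."""
    for end in range(len(chunk), 0, -1):
        head = chunk[:end]
        if head in DICTIONARY:
            return (head, head, "direct")
        for ending, base in INFLECTIONS:
            if head.endswith(ending):
                lemma = head[: len(head) - len(ending)] + base
                if lemma in DICTIONARY:
                    return (head, lemma, "inflected")
    return None


def gloss(text):
    """Return (text span, dictionary form, kind) for each word found in text."""
    found = []
    pos = 0
    while text[pos:]:
        run = KATAKANA_RUN.match(text, pos)
        chunk = KANJI_CHUNK.match(text, pos)
        hit = None
        step = 1
        if run:
            step = len(run.group())         # a katakana run is looked up whole
            if run.group() in DICTIONARY:
                hit = (run.group(), run.group(), "direct")
        elif chunk:
            hit = longest_match(chunk.group())
            if hit:
                step = len(hit[0])
        if hit:
            found.append(hit)
        pos += step
    return found
```

<p>In this sketch a "direct" result corresponds to text the server would show in red, and an "inflected" result to text it would show in blue.</p>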
Matched jukugo which are followed by what appears to be a particle (i.e. "wa", "no", "ni", "na", etc.) are trimmed back to just the jukugo to avoid misreporting matches from phrases and similar long dictionary entries. <p>Users may request that translations only appear once for each Japanese word or phrase. <p> The user can invoke any dictionary file for the matching, but the combination GLOSSDIC file is the default, and is strongly recommended. (Note that using the main EDICT file in this function is not recommended, as its format is no longer fully compatible with the search system employed.) One advantage of using this combined file is that it increases the chance of getting a correct match for a word, particularly if the text contains names. Also, the component sub-files in GLOSSDIC are tagged, and the match function gives preference to entries in the following order (tags shown "EP", etc.):</p> <ul> <li> a small file of special words and phrases (SP);</li> <li> a subset of the most common 20,000 entries in the EDICT file (EP);</li> <li> from the rest of the EDICT file (ED);</li> <li> the other glossary files (PP, AV, CO, LW1, LW2, LS, FM, BU, GA);</li> <li> the ENAMDICT entries. (A special version of the file is used in which kanji names with multiple readings are combined into a single entry, with the most frequently used readings first);</li> </ul> <p> The reason the EDICT subset is used is so that the appropriate match is made when there are several readings of a jukugo, for example the "adult" compound will be matched against the word "otona" instead of the less common "dainin".</p> <p> The full details of all the dictionary files are provided below.</p> <p><b>Further Comments on WWW Page Translation</b></p> <p> Please note that if you are wanting to examine Japanese text within a frame, you may have to examine the source file (e.g. View/Source) to get the address of the actual file containing the text. 
An alternative is to open the frame in a window of its own.</p> <p>Please appreciate that the function is somewhat crude and simplistic. It can occasionally mis-parse long strings of kanji, so users are advised to examine the results carefully, especially where the text only partially matches the dictionary entry. There is a small <font color="red">[Partial Match]</font> when this occurs. <p> A large amount of text will result in hundreds of dictionary searches, so the server may take a while to respond.</p> <p> There is a <a href= "http://nihongo.monash.edu/wwwjtrans.html">front page</a> for this function which uses frames so you can have the viewed page and WWWJDIC side-by-side.</p> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="dicfil_tag">DICTIONARY FILES</a></h2> <p>The dictionary files used by the server are:</p> <ul> <li>a composite <b>KANJI</b> dictionary file, which is used by the server for all the kanji search functions. The components are: <ul><li>the KANJIDIC file which contains comprehensive information about the most common 6,355 Japanese kanji as specified in the JIS X 0208-1990 set. </li> <li>the KANJD212 file of information about the 5,801 additional kanji in the JIS X 0212-1990 standard. It is in the same format as the KANJIDIC file. </li> </ul> See the <a href="http://www.edrdg.org/wiki/index.php/KANJIDIC_Project">documentation</a> of the KANJIDIC project. <p> <li>General Japanese-English Dictionary. <ul> <li> The server uses the <b>EDICT</b> file, which is the outcome of a voluntary project to produce a freely-available Japanese/English Dictionary in machine-readable form. This project has been under way since early 1991, and has involved hundreds of people. It now has over 190,000 entries, and is the major freeware Japanese-English lexicon. 
(There is a <a href= "http://www.edrdg.org/jmdict/edict.html">summary page</a> about the file, as well as the <a href= "http://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project">full documentation</a>.) The version of the EDICT file used by the WWWJDIC server is the "EDICT2" extended version, in which kanji and reading variants are held within the single entry instead of being in separate entries. It also contains a subfile of the conjugations/inflections of about 2,000 common Japanese verbs, enabling pointers back to the dictionary form of the verbs. </li> <li> In addition, the server also has the Japanese WordNet installed as an optional dictionary. This has 158,000 entries. See the <a href="http://compling.hss.ntu.edu.sg/wnja/index.en.html">licence</a> and additional information. If you want to comment on entries in this dictionary, use the contacts listed on the WWW site above, and include the "synset" number at the end of the entries. Note that the readings in this dictionary have been generated using the MeCab morphological analysis program along with the UniDic lexicon. The results are not always accurate. </li> </ul> <p> <li>Subject-Specific Dictionary Files. <ul> <li>The <b>Japanese Names (ENAMDICT)</b> file which contains Japanese proper names; place-names, surnames and given names. The basic format of the ENAMDICT file is the same as the EDICT file, however for ease of use, the version used by the WWWJDIC server has been modified to include all the possible readings of a name within the one entry, with the readings approximately in frequency-of-use order. (<a href= "http://nihongo.monash.edu/enamdict_doc.html">Full documentation</a>.)</li> <li>The <b>Computing/Telecomms (COMPDIC)</b> file, which contains terms used in the computing and (tele)communications industries. (<a href= "http://ftp.edrdg.org/pub/Nihongo/compdic_doc.html">Full documentation</a>.) 
In June 2008 the entries in this file were combined with the full JMdict/EDICT file, and the terms or senses specific to information technology and telecommunications were tagged with "{comp}". The file is now simply a subset of the main EDICT file.</li> <li>The <b>Life Sciences (LIFSCIDIC)</b> file, which is the Japanese-English Life Science dictionary in the EDICT format. (April 2018) This dictionary contains over 130,000 Japanese bio-medical words frequently used in Life Science publications. The LSD was compiled by the <a href="https://lsd-project.jp/cgi-bin/lsdproj/ejlookup04.pl">Life Science Dictionary Project</a>, led by Professor Shuji Kaneko at Kyoto University. (See the <a href="https://lsd-project.jp/en/project/about.html">project overview</a>.) <li>The <b>Finance/Marketing (FINMKTDIC)</b> file, which is a concatenation of Kevin Seaver's glossary of financial terms (FINDIC), and Adam Rice's business & marketing glossary (MKTDIC). (Documentation files: <a href= "http://ftp.edrdg.org/pub/Nihongo/findic.doc">here</a> and <a href="http://ftp.edrdg.org/pub/Nihongo/mktdic.doc"> here</a>.)</li> <li>The <b>Linguistics (LINGDIC)</b> file compiled by Francis Bond in 1998, recently updated by Francis and by Paul Blay (<a href= "http://ftp.edrdg.org/pub/Nihongo/lingdic.txt">documentation</a>). <li> The <b>Legal Terms (LAWDIC)</b> file, which is a glossary of legal terminology, currently containing about 6,000 terms. It is a concatenation of two glossaries: <ul> <li>LAWDIC1 (LW1) - the EDICT-format version of the Japanese Legal Glossary compiled by the Asian Law Program, School of Law, University of Washington. It was transcribed to file by a team of volunteers in 1995. (<a href= "http://ftp.edrdg.org/pub/Nihongo/lawgldoc.euc">documentation</a>.)</li> <li>LAWDIC2 (LW2) - the EDICT-format version of the "Standard Legal Terms Dictionary (2018)" produced by the Japanese Cabinet Secretariat <a href="http://www.japaneselawtranslation.go.jp/dict/download?re=02">website</a>. 
Where an entry is in both files, the SBD version is included.</li> </ul> <li>The <b>Buddhism (BUDDHDIC)</b> file - an extract of about 58,000 entries from the <a href="http://www.buddhism-dict.net/ddb/">Digital Dictionary of Buddhism (DDB)</a>. When using this file to look up words, you have the option of linking to the related entry in the full DDB. Note that you have to enter the login name "guest" (no password), and you are limited to 10 DDB accesses per 24-hour period. <li>The <b>Engineering/Science (ENGSCIDIC)</b> file - a 14,000 entry file of words mostly relating to engineering and science, which became available in October 2001. (<a href="http://nihongo.monash.edu/engscidich.html">Full documentation</a>.) <li>The <b>River & Water Systems (RIVERWATER)</b> file - a version of the River and Water Resources Glossary produced by the Infrastructure Development Institute - Japan. See my <a href="http://nihongo.monash.edu/riverwater_doc.html">short description</a>. </li> <li>The <b>Automobile Industry (CARDIC)</b> file - a version of K. Tomita's <a href="http://homepage3.nifty.com/k_tomita~s/romarin/data/car_dic.htm">Car_Dic</a> file. <li>The <b>Work-in-progress file (WIPFILE) </b> is a set of dictionary entries which have been collected from various sources. These are entries which are not yet in the main EDICT file. They are in the process of being confirmed and edited into the main file. (This file incorporates a set of extra loanwords (GAIDIC) previously used by this server.)</li> <li>The <b>Miscellaneous (MISCDIC)</b> file, which is a concatenation of several small glossary files. These have been merged, and the entries have been given two-letter tags to show their source. <ul> <li>GEODIC (GE) - geological terminology file compiled by Bruce Bain and Leslie Oberman. (<a href= "http://ftp.edrdg.org/pub/Nihongo/geodic.doc">documentation</a>)</li> <li>PANDPDIC (PP) - Jim Minor's Pulp & Paper Industry Glossary file. 
(<a href= "http://ftp.edrdg.org/pub/Nihongo/pandpgls.doc">documentation</a>)</li> <li>AVIATION (AV) - Ron Schei's Aviation Dictionary File (<a href= "http://ftp.edrdg.org/pub/Nihongo/aviation.txt">documentation</a>)</li> <li>CONCRETE (CO) - Gururaj Rao's Concrete Terminology Glossary (<a href= "http://ftp.edrdg.org/pub/Nihongo/concrete.doc">documentation</a>)</li> <li>STARDICT (ST) - a list of star and constellation names prepared by Raphael Garrouty in 2001.</li> <li>FORSDIC_E (FO) - a list of forestry terms compiled by Juan Cardona (<a href="http://ftp.edrdg.org/pub/Nihongo/forsdic.txt">documentation</a>). <li>ENVGLOSS (EV) - a short glossary of environmental terms (<a href="http://ftp.edrdg.org/pub/Nihongo/envgloss.inf">documentation</a>)</li> <li>MANUFDIC (MA) - a short glossary of manufacturing terms (<a href="http://ftp.edrdg.org/pub/Nihongo/manufinf">documentation</a>)</li> <li> FLOWER_PLANTS (FP) - a short list of flower and plant names compiled by Clemente Beghi. </ul> </ul> <p> <li><a name="dicfilf_tag">Japanese-Other Language Dictionary Files</a>. <ul> <li>The <b>Japanese-German (JDDICT)</b> file, which is a version of the <a href="http://www.wadoku.de/wadoku/">WaDokuJT</a> Japanese-German dictionary file compiled by Ulrich Apel. (Jan 2018 download. 336,000 entries including names in EDICT2 format)</li> <li>The <b>Japanese-Russian (Warodai)</b> file (121,000 entries). This major Japanese-Russian dictionary is based on a digitized version of "Большой японско-русский словарь" (БЯРС) compiled by N.I.Konrad, S.V.Neverov, K.A.Popov, N.A.Syromyatnikov, M.S.Tsin and V.M.Konstantinov. It was first printed in 1970. The current project site for Warodai is <a href="https://www.warodai.ru/lookup/index.php">here</a>. <!-- and an earlier site is <a href="http://warodai.ru">here</a>. --> The conversion of the Warodai file into the EDICT format has kindly been carried out by Vitaly Zagrebelny. 
Vitaly's own online dictionary site at <a href="http://www.jardic.ru/">http://www.jardic.ru/</a> is well worth a visit. </li> <li>The <b>Japanese-French (J-FRENCH)</b> file of 15,600 entries uses the following sources: <ul> <li> 11,400 entries from the Japanese-French dictionary file from the <a href="http://dico.fj.free.fr/">Dictionnaire français-japonais</a> project being undertaken by Jean-Marc Desperrier. As Jean-Marc says on that page, his project's aim "est de traduire en français une partie du dictionnaire japonais-anglais Edict de Jim Breen". His project has been dormant now for several years. <li> 4,200 entries from the French translations in the <a href="https://www.transifex.com/gnurou/jmdict-i18n/dashboard/">JMdict Internationalization project</a>. This project is ongoing. </ul> <!-- (I used also to include about 41,000 entries from a dictionary compiled by <a href="http://francais.sourceforge.jp/">le projet francais pour francophone</a>. This file appears to be based around translating the EDICT file. There is some evidence that this file used translations generated by an online resource such as Babelfish, and as the quality of the French translations was low, those entries have now been withdrawn.) --> <li> The <b>Japanese-Swedish (JSVEDIC)</b> file, which is the result of a project to use recently developed techniques to reliably translate EDICT entries into Swedish. (See a <a href="http://www.nada.kth.se/~jsh/publications/jlex.pdf">report</a> on the project; the resulting file has been considerably edited to improve the quality of the translations.) The file is available from the <a href="http://www.japanska.se/">Japanska.se</a> site.</li> <li>The <b>Japanese-Hungarian (JHUNGDIC)</b> file. This 48,000-entry dictionary file was compiled by Istvan Varga. Istvan compiled this dictionary by matching EDICT with an English-Hungarian dictionary, using some advanced NLP techniques. 
See his interesting <a href="http://www.mt-archive.info/MTS-2007-Varga.pdf">paper</a> on the process. Istvan has now published some <a href="http://www.vargamakai.com/az.html">Japanese-Hungarian dictionaries</a>. <li>The <b>Japanese-Spanish (JSPANISH)</b> dictionary file (41,000 entries). This is a combination of two dictionaries: <ol type="a"> <li>the 26,000 entry "HISPADIC" Japanese-Spanish dictionary from that wonderful collaborative project (<a href="http://hispadic.byethost3.com/">WWW site</a>, which also has a search facility). <li>the 20,000 entry "RUI" Japanese-Spanish dictionary compiled by Francisco Barberan and included with his permission. The dictionary's <a href="http://www.nichiza.com/index.php?len=0&it=6">home page</a> also has an online search facility. </ol> <li> The <b>Japanese-Dutch (JDUTCH)</b> dictionary file. With about 60,000 entries, this file has been generated from the <a href="http://japansnederlandswoordenboek.org/index.php/Hoofdpagina">Waran Jiten project</a> wiki at the Katholieke Universiteit Leuven in Belgium. The current version is from August 2021. <li> The <b>Japanese-Slovenian (JASLOV)</b> dictionary file of about 10,000 entries. Thanks to Kristina Hmeljak and Tomaz Erjavec for providing access to the file. <li> The <b>Japanese-Italian (JITALIAN)</b> dictionary file. About 39,000 entries (at Jan. 2016). This file is the Beta release evolved from the ITADICT project at Ca' Foscari University of Venice coordinated by A. Mantelli and M. Mariotti. See the project site (<a href="https://a4edu.unive.it/ita/aboutus">https://a4edu.unive.it/ita/aboutus</a>) for more information. <!-- <li>The <b>Japanese-Russian (J-RUSSIAN)</b> file - a small Japanese-Russian dictionary file being compiled by Oleg Volkov. See Oleg's <a href="http://ftp.edrdg.org/pub/Nihongo/jr-edict.doc.rus.win1251.txt">documentation</a> (in Russian).</li> --> </ul> <p> <li>Glossing, etc. Dictionary Files. 
<ul> <li>The <b>Special Text-glossing/Combined Jpn-Eng (GLOSSDIC/THE_LOT)</b> files - a combination of most of the above files. (See earlier section on Translating Text.) The GLOSSDIC file is used for text glossing. When this file is generated, duplicated entries are removed, retaining the entry from the highest-ranking source. The entries are tagged to indicate the source dictionary file. An extended version (GLOSSDICX) containing the WIPFILE entries as well is available as an option. The THE_LOT file is simply a concatenation of the other files, and can be useful for wide-ranging dictionary searches, however it can lead to multiple results.</li> <li> The <b>Untranslated (REVHENKAN)</b> file - a collection of 130,000 words from the conversion files of various Input Methods. These words are <b>not</b> in the files above, and have no English meanings (as yet). They are available in case anyone wants to find out the reading of a word which is not in the main dictionary files. They are also useful as templates for new entries. Note that readings may not be always correct, as some IME systems include common mis-spellings and guesses. Also in the case of many verbs and adjectives the roots alone are included - this should not be taken to mean they are valid words in Japanese.</li> <li> A small file of words and phrases written in hiragana. These are mostly drawn from the EDICT file, and are used only when translating words in text. </li> </ul> </ul> <h3>Character Display</h3> Some of the dictionary files contain characters used in languages such as French, German, Russian, Sanskrit, etc., which are not available in the common JIS X 0208 character set. These characters are coded in the extension set - JIS X 0212 - however most browsers cannot display these characters correctly in the default EUC-JP coding, and they are not available at all in Shift-JIS coding. 
For this reason <ul><li> if you are using WWWJDIC in either EUC or Shift_JIS coding the characters are sent from the server either as HTML entities, e.g. &eacute; for é, or as bit-mapped PNG images. Depending on the font you have chosen for your browser, these characters may appear a little strange. <li> if you are using UTF8 (Unicode) coding, the actual character codes are sent from the server and the display will be according to the fonts you have available with your browser. </ul> <p> Please note that the dictionary material is for the most part copyright. Publication of material from WWWJDIC is permitted, provided appropriate acknowledgements are made. See the <a href="#copyr_tag">Copyright</a> section below for more information on this. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="mult_tag">MULTI-RADICAL KANJI SELECTION</a></h2> The Multi-Radical Kanji Selection enables you to search for a kanji using the component "shapes" within the kanji. Each of the 12,356 kanji in the JIS X 0208 and JIS X 0212 standards has been analyzed and their components classified according to a set of 250 basic shapes. These shapes correspond approximately to the 214 "KiangXi" or classical radicals used by many kanji dictionaries, however a number of other common shapes such as 矣 and ユ are also used. <p> You may need to experiment with this function to get used to identifying the components of a kanji. Note that some components are further subdivided, e.g. the kanji 厦 is classified by the shapes: 庚, 厘 and 咐. <p> This function uses the "radkfile" file, which contains the radical-element breakdown for the JIS kanji. The JIS X 0208 file was originally prepared by Michael Raine and revised and extended by Jim Breen, and the JIS X 0212 file was prepared by Jim Rose. These files are used to drive the multi-radical kanji-selection feature. 
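<p>The idea behind the selection can be sketched in Python as a set intersection over a component-to-kanji index. This is only an illustration under stated assumptions: the <code>COMPONENTS</code> table is a small hypothetical sample rather than data read from the actual files, and the function names <code>invert</code> and <code>select</code> are invented here.</p>

```python
# Hypothetical kanji-to-component breakdown; the real data lives in the radical files.
COMPONENTS = {
    "明": {"日", "月"},
    "時": {"日", "土", "寸"},
    "寺": {"土", "寸"},
    "村": {"木", "寸"},
}


def invert(components):
    """Build the component-to-kanji index from the kanji-to-component table."""
    index = {}
    for kanji, parts in components.items():
        for part in parts:
            index.setdefault(part, set()).add(kanji)
    return index


def select(index, chosen):
    """Return the kanji that contain every chosen component."""
    sets = [index.get(part, set()) for part in chosen]
    return set.intersection(*sets) if sets else set()
```

<p>Each additional component chosen narrows the candidate set, e.g. <code>select(invert(COMPONENTS), ["日", "寸"])</code> leaves only 時 in this sample.</p>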
(If you want a copy of the files, the current versions are <a href="http://ftp.edrdg.org/pub/Nihongo/kradzip.zip">here</a>.) The files are inversions of the kanji-radical source files. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="examp_tag">EXAMPLE SENTENCES</a></h2> The WWWJDIC server includes a large file of Japanese/English sentences which have been linked to the EDICT dictionary file so that sentences can be displayed by clicking on the "Ex" tag after the entry. In addition, a number of sentences have been identified as suitable examples for particular entries, and are displayed whenever the entry is shown. The sentence file can also be searched, and there is a mechanism for submitting corrections online. <p> The examples are mostly drawn from the <a href="http://www.edrdg.org/wiki/index.php/Tanaka_Corpus">Tanaka Corpus</a>, a collection of Japanese/English sentences initially compiled by Professor Yasuhito Tanaka at Hyogo University and his students. The original sentences appear to be mostly from educational material, text books, etc. The collection was placed in the Public Domain by Professor Tanaka, and has since been released under a Creative Commons "CC-BY" licence. </P> <P> The collection is large (approximately 150,000 pairs) and is being edited as there are a number of errors and duplications in both the Japanese and English texts. A number of additional sentences have been added to provide examples of word usage. </P> Any suggested corrections or sentences to add to the collection are welcome, and should be submitted using the Suggestion/Comment option on the page displaying the sentences. This will link you to the <a href="http://tatoeba.org/eng/">Tatoeba Project</a>, where the sentences are now maintained. <P> If you would like to download a complete copy of the current file of example sentences, including the index words, it is available via <a href="http://nihongo.monash.edu/examples.gz">http</a>.
(<a href="http://nihongo.monash.edu/examples_date">Date</a> of the most recent version.) A <a href="http://nihongo.monash.edu/examples_s.gz">subset</a> file which is only about 30% the size of the full file is also available. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="verb_tag">VERB CONJUGATIONS</a></h2> Most of the verbs in the main EDICT file allow an optional display of a table of verb conjugations. Where this is available, a <b>[V]</b> tag appears to the right of the verb display.</P> <P> The table of conjugations is generated automatically according to the part-of-speech tag in the entry. It should not be assumed that for every verb, any single conjugation is as frequently used or as natural as any other. </P> <P> Associated with the table of conjugations is a page of <a href="http://nihongo.monash.edu/wwwverbinf.html">supplementary comments</a> which attempts to expand some of the more obscure points. </P> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="sod_tag">STROKE ORDER DIAGRAMS</a></h2> <h3>Jack Halpern's Diagrams</h3> Associated with the most common 2,200 kanji (i.e. the Jouyou and pre-2004 Jinmeiyou kanji) are animated Stroke Order Diagrams. Where these are available, an image of a brush will appear at the end of the information display for a kanji (example of a <a href= "http://nihongo.monash.edu/cgi-bin/wwwjdic?1MKU6F22">kanji</a> with a <a href="http://nihongo.monash.edu/cgi-bin/wwwjdic?160657_%B4%C1">diagram</a>) <P> The images used in this animation are the art-work from the New Japanese-English Character Dictionary (see <a href="http://www.kanji.org/">http://www.kanji.org/</a>), and are used with the kind permission of Mr Jack Halpern. They were scanned and cleaned up by Jeffrey Friedl to go into Jack's Kanji Learner's Dictionary. 
<P> The Stroke Order Diagram animation was carried out as follows: <ol> <li> the source of the diagrams is the digitized multi-panel form from the printed kanji dictionaries, in which the kanji is built up stroke by stroke. Jack Halpern provided these as BMP files. <li> each panel of the diagram was extracted into a separate file using a combination of a special utility program and the <em>bmptopnm</em> and <em>ppmtogif</em> utilities. <li> for each kanji, the <em>gifsicle</em> utility was used to make an animated GIF of the whole kanji. Some twitch a bit due to occasional alignment inaccuracies. </ol> All this took a bit of debugging, but once it was working, it only took a few minutes to generate the diagrams for all 2,200 kanji. All this was done on a Sun system running Solaris, so the GIF files are quite legal under the Unisys patent. <p> <h3>Jim Rose's Diagrams</h3> In addition, a further set of animated Stroke Order Diagrams is available from Jim Rose's <a href="http://www.kanjicafe.com/using_soder.htm">SODER</a> initiative at www.kanjicafe.com. (<a href="http://www.kanjicafe.com/license.htm">licence</a>) Where available, these are indicated by a second brush image. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="links_tag">LINKS TO OTHER SYSTEMS</a></h2> <P> An interesting feature of WWWJDIC is the system of links to other servers and files. These are: <ol type="a"> <li> to other WWW kanji/hanzi/hanja character dictionaries. These links go from the kanji information page, and enable direct access to the information about that kanji held on other databases. The databases currently linked are: <ul> <li>Charles Muller's <a href= "http://www.buddhism-dict.net/dealt/">World Wide Web CJK-English Dictionary Database</a>.<br> This database contains a wealth of information, with a particularly classical emphasis.
A feature is an index into his dictionary of Buddhist terms.</li> <li>Rick Harbaugh's <a href="http://www.zhongwen.com/">Zhongwen Zipu (Etymological Chinese-English Dictionary)</a>.<br> This is a fascinating dictionary (available as a CD-ROM too), with a wealth of etymological information about the characters, including a genealogical chart. It has a specifically Chinese orientation.</li> <!-- <li>Christian Wittern's <a href= "http://www.kb.oas.hist.uni-goettingen.de/cgi-win-d/kbwww.exe/"> KanjiBase</a> WWW character dictionary. This is under development, and carries a wealth of information from Christian's extensive collection.</li> --> <li>Timothy Huang's Big5 Database. This is a file of codes and related information in the Big5 set of hanzi compiled by Professor Timothy Huang, co-author of the book "An Introduction to Chinese, Japanese & Korean Computing". For further information, contact Tim on <a href="mailto:timd_huang@formac.com.tw"> timd_huang@formac.com.tw</a>.</li> </ul> <p> The "unifying" code we use to implement these links is the Unicode (UCS2) code-point. We intend to have all the systems cross-linked. You can index from Chuck's and Rick's systems back to WWWJDIC. </p> <li> the <a href="http://www.jekai.org/">jeKai Project</a>. This project is developing a WWW-based dictionary of extended information about words & phrases in Japanese. WWWJDIC examines the jeKai index and when it displays a Japanese word which is in the jeKai files, it creates a link. [jeKai] <li> the online Sanseido dictionary at <a href="http://dictionary.goo.ne.jp/">Goo</a>. The link goes from the normal word display, and triggers the JE server at that site. You can use the other dictionaries at that site, including the big Daijirin. [S] <li> the Google search engine, which is called with the displayed Japanese word(s) as a search key. The "images" option can also be used. [G] and [GI] <li> the Eijiro dictionary at the ALC server in Japan. [A] <li> the Japanese Wikipedia. 
WWWJDIC maintains a list of all article headings in the Japanese Wikipedia, and where an article is available for a displayed dictionary entry, a link is provided. [W] <li> the <a href="http://compling.hss.ntu.edu.sg/wnja/index.en.html">Japanese WordNet</a> now at NTU in Singapore. As with the Japanese Wikipedia, WWWJDIC maintains a list of all words in the Japanese WordNet, and provides a link when a displayed entry matches. [JW] </ol> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <H2><a name="newsub_tag">SUBMITTING AMENDMENTS AND NEW ENTRIES</a></H2> Users of WWWJDIC are welcome to submit amendments to the dictionary files, and also to submit new entries via the online dictionary database. There is a drop-down menu after each entry labelled "[Links]" and options on that menu will take you to the database view/edit pages for the entry. Use the "New Entry" link at the top of one of those pages. <p> There is a page of <a href="http://nihongo.monash.edu/wwwnewwordexpl.html">basic advice</a> about submitting an entry, and you should also read the <a href="http://www.edrdg.org/wiki/index.php/Editorial_policy">editorial policy</a> page on the <a href="http://www.edrdg.org/wiki/index.php/Main_Page">EDRDG Wiki</a>. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <H2><a name="multling_tag">JAPANESE INTERFACE</a></H2> Late in 2007 work began on modifying the server code and building parallel message tables so that users could opt for either English or Japanese as the language of the server interface. A major set of messages was translated in July/August 2008. At this stage many of the server functions were available entirely in Japanese. <p>At present there are two methods available for setting the language: <ul> <li>via a cookie setting in your browser. The Japanese/English selection is an option on the Customize Page, and the language preference is set in the customization cookie.
There is also a link on the front page which selects the language and resets the cookie. <li>via the URL. If you use a URL with "wwwjdicj" instead of "wwwjdic" (or "wwwjdicj.cgi" instead of "wwwjdic.cgi"), you will always get the Japanese version. </ul> <p> The following people have made major contributions to the provision of Japanese messages in the server interface: <ul> <li>muchan <li>Paul Blay <li>Kouji Ueshiba </ul> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <H2><a name="code_tag">ABBREVIATIONS AND CODES USED IN DICTIONARY ENTRIES</a></H2> <P> The dictionary entries contain a number of abbreviations and codes, mainly to reduce storage usage and display space. (<a href="http://www.edrdg.org/jmdictdb/cgi-bin/edhelp.py?svc=jmdict&sid=#kw_rinf">Full list of codes</a>. These are in sections, so you may need to scroll down.) </P> <h3>Kanji and Reading Codes</h3> Note that more codes have been added recently. The full current list is <a href="https://www.edrdg.org/jmdictdb/cgi-bin/edhelp.py?svc=jmdict&sid=#kw_rinf">here</a>. 
<CENTER><TABLE BORDER> <TR> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> </TR><TR> <TD><B>ateji </B></TD> <TD>ateji (phonetic) reading</TD> <TD><B>ik </B></TD> <TD>word containing irregular kana usage</TD> <TD><B>iK </B></TD> <TD>word containing irregular kanji usage</TD> <TD><B>io </B></TD> <TD>irregular okurigana usage</TD> </TR><TR> <TD><B>oK </B></TD> <TD>word containing out-dated kanji or kanji usage</TD> <TD><B>rK </B></TD> <TD>rarely-used kanji form</TD> <TD><B>gikun </B></TD> <TD>gikun (meaning as reading) or jukujikun (special kanji reading)</TD> <TD><B>ik </B></TD> <TD>word containing irregular kana usage</TD> </TR><TR> <TD><B>ok </B></TD> <TD>out-dated or obsolete kana usage</TD> <TD><B>uK </B></TD> <TD>word usually written using kanji alone</TD> <TD><B> -</B></TD> <TD> -</TD> <TD><B> -</B></TD> <TD> -</TD> </TR></TABLE></CENTER> <h3>Part-of-Speech (POS) Codes</h3> Note that more codes have been added recently. The full current list is <a href="https://www.edrdg.org/jmdictdb/cgi-bin/edhelp.py?svc=jmdict&sid=#kw_pos">here</a>. <CENTER><TABLE BORDER> <TR> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> </TR><TR> <TD><B> adj-i </B></TD> <TD>adjective (keiyoushi)</TD> <TD><B> adj-kari </B></TD> <TD>`kari' adjective (archaic)</TD> <TD><B> adj-ku</B></TD> <TD>`ku' adjective (archaic)</TD> <TD><B> adj-f</B></TD> <TD>noun, verb, etc. acting prenominally (incl. 
rentaikei)</TD> </TR><TR> <TD><B> adj-na</B></TD> <TD>adjectival nouns or quasi-adjectives (keiyoudoushi)</TD> <TD><B> adj-nari</B></TD> <TD>archaic/formal form of na-adjective</TD> <TD><B> adj-no </B></TD> <TD>nouns which may take the genitive case particle "no"</TD> <TD><B> adj-pn</B></TD> <TD>pre-noun adjectival (rentaishi)</TD> </TR><TR> <TD><B> adj-shiku</B></TD> <TD>`shiku' adjective (archaic) </TD> <TD><B> adj-t</B></TD> <TD>`taru' adjective </TD> <TD><B> adv </B></TD> <TD>adverb (fukushi)</TD> <TD><B> adv-to </B></TD> <TD>adverb (with particle "to")</TD> </TR><TR> <TD><B> aux </B></TD> <TD>auxiliary</TD> <TD><B> aux-v </B></TD> <TD>auxiliary verb</TD> <TD><B> conj</B></TD> <TD>conjunction</TD> <TD><B> ctr</B></TD> <TD>counter</TD> </TR><TR> <TD><B> exp</B></TD> <TD>Expressions (phrases, clauses, etc.)</TD> <TD><B> id </B></TD> <TD>idiomatic expression </TD> <TD><B> int</B></TD> <TD>interjection (kandoushi)</TD> <TD><B> n</B></TD> <TD>noun (common) (futsuumeishi)</TD> </TR><TR> <TD><B> n-p</B></TD> <TD>proper noun</TD> <TD><B> n-adv</B></TD> <TD>adverbial noun (fukushitekimeishi)</TD> <TD><B> n-t</B></TD> <TD>noun (temporal) (jisoumeishi)</TD> <TD><B> pn</B></TD> <TD>pronoun</TD> </TR><TR> <TD><B> prt</B></TD> <TD>particle</TD> <TD><B> pref </B></TD> <TD>prefix </TD> <TD><B> suf </B></TD> <TD>suffix </TD> <TD><B> v1</B></TD> <TD>Ichidan verb</TD> </TR><TR> <TD><B> v2a-s, v2k-k, etc.</B></TD> <TD>Nidan verb (lower/upper) with 'u', `ku', etc. endings (archaic)</TD> <TD><B> v4k, v4r, etc.</B></TD><TD>Yodan verb with `ku', `ru', etc. endings (archaic)</TD> <TD><B> v5u, v5k, etc.</B></TD> <TD>Godan verb with `u', `ku', etc. endings</TD> <TD><B> v5k-s</B></TD> <TD>Godan verb - Iku/Yuku special class</TD> </TR><TR> <TD><B> v5aru</B></TD> <TD>Godan verb - -aru special class</TD> <TD><B> vi </B></TD> <TD>intransitive verb </TD> <TD><B> vs </B></TD> <TD>noun or participle which takes the aux. 
verb suru</TD> <TD><B> vs-c </B></TD> <TD>su verb - precursor to the modern suru</TD> </TR><TR> <TD><B> vs-i </B></TD> <TD>expression using the aux. verb suru(*)</TD> <TD><B> vs-s</B></TD> <TD>suru verb - special class</TD> <TD><B> vk</B></TD> <TD>Kuru verb - special class</TD> <TD><B> vt </B></TD> <TD>transitive verb</TD> </TR><TR> <TD><B> vz</B></TD> <TD>Ichidan verb - -zuru special class (alternative form of -jiru verbs)</TD> <TD><B> v-unspec</B></TD> <TD> verb - unspecified (usu. archaic)</TD> <TD><B> -</B></TD> <TD> -</TD> <TD><B> -</B></TD> <TD> -</TD> </TR></TABLE></CENTER> (*) This tag is also used for the する entry. It is primarily used to assist the verb conjugation table function in WWWJDIC. <h3>Miscellaneous Codes</h3> Note that more codes have been added recently. The full current list is <a href="https://www.edrdg.org/jmdictdb/cgi-bin/edhelp.py?svc=jmdict&sid=#kw_misc">here</a>. <CENTER><TABLE BORDER> <TR> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> </TR><TR> <TD><B> abbr </B></TD> <TD>abbreviation</TD> <TD><B>aphorism </B></TD><TD>aphorism (pithy saying) </TD> <TD><B> arch </B></TD> <TD>archaism</TD> <TD><B> chn </B></TD> <TD>children's language </TD> </TR><TR> <TD><B> col </B></TD> <TD>colloquialism </TD> <TD><B>dated </B></TD><TD>dated term </TD> <TD><B>derog </B></TD><TD>derogatory </TD> <TD><B>euph </B></TD><TD>euphemistic </TD> </TR><TR> <TD><B> fam </B></TD> <TD>familiar language </TD> <TD><B> fem </B></TD> <TD>female term or language</TD> <TD><B>form </B></TD><TD>formal or literary term </TD> <TD><B>hist </B></TD><TD>historical term </TD> </TR><TR> <TD><B> hon </B></TD> <TD>honorific or respectful (sonkeigo) language </TD> <TD><B> hum </B></TD> <TD>humble (kenjougo) language </TD> <TD><B>id </B></TD><TD>idiomatic expression </TD> <TD><B> joc</B></TD> <TD>jocular or humorous term</TD> </TR><TR> <TD><B> 
male </B></TD> <TD>male term or language</TD> <TD><B> m-sl </B></TD> <TD>manga slang</TD> <TD><B>net-sl </B></TD><TD>Internet slang </TD> <TD><B> obs </B></TD> <TD>obsolete term</TD> </TR><TR> <TD><B> obsc </B></TD> <TD>obscure term</TD> <TD><B> on-mim </B></TD> <TD>onomatopoeic or mimetic word</TD> <TD><B>poet </B></TD><TD>poetical term </TD> <TD><B> pol </B></TD> <TD>polite (teineigo) language </TD> </TR><TR> <TD><B>proverb </B></TD><TD>proverb </TD> <TD><B>quote </B></TD><TD>quotation </TD> <TD><B>rare </B></TD><TD>rare term </TD> <TD><B> sl </B></TD> <TD>slang</TD> </TR><TR> <TD><B> sens </B></TD> <TD>term with some sensitivity about its usage</TD> <TD><B> uk </B></TD> <TD>word usually written using kana alone </TD> <TD><B> vulg </B></TD> <TD>vulgar expression or word </TD> <TD><B> yoji </B></TD> <TD>four-character compound word (usu. idiomatic) </TD> </TR><TR> <TD><b> P</b></TD> <TD>"Priority" entry, i.e. among approx. 20,000 words deemed to be common in Japanese</TD> <TD><B> X</B></TD> <TD>rude or X-rated term (not displayed in educational software)</TD> <TD><B> -</B></TD> <TD> -</TD> <TD><B> -</B></TD> <TD> -</TD> </TR><TR> </TR></TABLE></CENTER> For more information about the P (Priority) markers, see the Word Priority Marking section in the <a href="http://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project#Word_Priority_Marking">JMdict/EDICT</a> documentation. <P> <h3>Domain or Field Codes</h3> These indicate that the word or expression has particular application (but not necessarily exclusive application) in the specified domain. <br> Note that more codes have been added recently. The full current list is <a href="https://www.edrdg.org/jmdictdb/cgi-bin/edhelp.py?svc=jmdict&sid=#kw_fld">here</a>. 
<CENTER><TABLE WIDTH="100%" BORDER> <TR> <TD><B> CODE </B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> </TR> <TR> <TD><B>agric </B></TD><TD>agriculture </TD> <TD><B>anat </B></TD><TD>anatomy </TD> <TD><B>archeol </B></TD><TD>archeology </TD> <TD><B>archit </B></TD><TD>architecture </TD> </TR> <TR> <TD><B>art </B></TD><TD>art, aesthetics </TD> <TD><B>astron </B></TD><TD>astronomy </TD> <TD><B>audvid </B></TD><TD>audiovisual </TD> <TD><B>aviat </B></TD><TD>aviation </TD> </TR> <TR> <TD><B>baseb </B></TD><TD>baseball </TD> <TD><B>biochem </B></TD><TD>biochemistry </TD> <TD><B>biol </B></TD><TD>biology </TD> <TD><B>bot </B></TD><TD>botany </TD> </TR> <TR> <TD><B>Buddh </B></TD><TD>Buddhism </TD> <TD><B>bus </B></TD><TD>business </TD> <TD><B>cards </B></TD><TD>card games </TD> <TD><B>chem </B></TD><TD>chemistry </TD> </TR> <TR> <TD><B>Christn </B></TD><TD>Christianity </TD> <TD><B>cloth </B></TD><TD>clothing </TD> <TD><B>comp </B></TD><TD>computing </TD> <TD><B>cryst </B></TD><TD>crystallography </TD> </TR> <TR> <TD><B>dent </B></TD><TD>dentistry </TD> <TD><B>ecol </B></TD><TD>ecology </TD> <TD><B>econ </B></TD><TD>economics </TD> <TD><B>elec </B></TD><TD>electricity, elec. eng. 
</TD> </TR> <TR> <TD><B>electr </B></TD><TD>electronics </TD> <TD><B>embryo </B></TD><TD>embryology </TD> <TD><B>engr </B></TD><TD>engineering </TD> <TD><B>ent </B></TD><TD>entomology </TD> </TR> <TR> <TD><B>film </B></TD><TD>film </TD> <TD><B>finc </B></TD><TD>finance </TD> <TD><B>fish </B></TD><TD>fishing </TD> <TD><B>food </B></TD><TD>food, cooking </TD> </TR> <TR> <TD><B>gardn </B></TD><TD>gardening, horticulture </TD> <TD><B>genet </B></TD><TD>genetics </TD> <TD><B>geogr </B></TD><TD>geography </TD> <TD><B>geol </B></TD><TD>geology </TD> </TR> <TR> <TD><B>geom </B></TD><TD>geometry </TD> <TD><B>go </B></TD><TD>go (game) </TD> <TD><B>golf </B></TD><TD>golf </TD> <TD><B>gramm </B></TD><TD>grammar </TD> </TR> <TR> <TD><B>grmyth </B></TD><TD>Greek mythology </TD> <TD><B>hanaf </B></TD><TD>hanafuda </TD> <TD><B>horse </B></TD><TD>horse racing </TD> <TD><B>kabuki </B></TD><TD>kabuki </TD> </TR> <TR> <TD><B>law </B></TD><TD>law </TD> <TD><B>ling </B></TD><TD>linguistics </TD> <TD><B>logic </B></TD><TD>logic </TD> <TD><B>MA </B></TD><TD>martial arts </TD> </TR> <TR> <TD><B>mahj </B></TD><TD>mahjong </TD> <TD><B>manga </B></TD><TD>manga </TD> <TD><B>math </B></TD><TD>mathematics </TD> <TD><B>mech </B></TD><TD>mechanical engineering </TD> </TR> <TR> <TD><B>med </B></TD><TD>medicine </TD> <TD><B>met </B></TD><TD>meteorology </TD> <TD><B>mil </B></TD><TD>military </TD> <TD><B>mining </B></TD><TD>mining </TD> </TR> <TR> <TD><B>music </B></TD><TD>music </TD> <TD><B>noh </B></TD><TD>noh </TD> <TD><B>ornith </B></TD><TD>ornithology </TD> <TD><B>paleo </B></TD><TD>paleontology </TD> </TR> <TR> <TD><B>pathol </B></TD><TD>pathology </TD> <TD><B>pharm </B></TD><TD>pharmacology </TD> <TD><B>phil </B></TD><TD>philosophy </TD> <TD><B>photo </B></TD><TD>photography </TD> </TR> <TR> <TD><B>physics </B></TD><TD>physics </TD> <TD><B>physics </B></TD><TD>physics </TD> <TD><B>physiol </B></TD><TD>physiology </TD> <TD><B>politics </B></TD><TD>politics </TD> </TR> <TR> <TD><B>print 
</B></TD><TD>printing </TD> <TD><B>psy </B></TD><TD>psychiatry </TD> <TD><B>psyanal </B></TD><TD>psychoanalysis </TD> <TD><B>psych </B></TD><TD>psychology </TD> </TR> <TR> <TD><B>rail </B></TD><TD>railway </TD> <TD><B>rommyth </B></TD><TD>Roman mythology </TD> <TD><B>Shinto </B></TD><TD>Shinto </TD> <TD><B>shogi </B></TD><TD>shogi </TD> </TR> <TR> <TD><B>ski </B></TD><TD>skiing </TD> <TD><B>sports </B></TD><TD>sports </TD> <TD><B>stat </B></TD><TD>statistics </TD> <TD><B>stockm </B></TD><TD>stock market </TD> </TR> <TR> <TD><B>sumo </B></TD><TD>sumo </TD> <TD><B>telec </B></TD><TD>telecommunications </TD> <TD><B>tradem </B></TD><TD>trademark </TD> <TD><B>tv </B></TD><TD>television </TD> </TR> <TR> <TD><B>vidg </B></TD><TD>video games </TD> <TD><B>zool </B></TD><TD>zoology </TD> <TD><B> -</B></TD> <TD> -</TD> <TD><B> -</B></TD> <TD> -</TD> </TR></TABLE></CENTER> <h3>Names Dictionary Codes</h3> <CENTER><TABLE BORDER> <TR> <TD><B> CODE </B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> </TR> <TR> <TD><B> s</B></TD> <TD>surname</TD> <TD><B> p</B></TD> <TD>place-name</TD> <TD><B> u</B></TD> <TD>person name, as-yet unclassified</TD> <TD><B> g</B></TD> <TD>given name, as-yet not classified by sex</TD> </TR> <TR> <TD><B> f</B></TD> <TD>female given name</TD> <TD><B> m</B></TD> <TD>male given name</TD> <TD><B> h</B></TD> <TD>a full (family plus given) name of a historical person</TD> <TD><B> c</B></TD> <TD>company name</TD> </TR> <TR> <TD><B> o</B></TD> <TD>organization name</TD> <TD><B> pr</B></TD> <TD>product name</TD> <TD><B> st</B></TD> <TD>station name</TD> <TD><B> ch</B></TD> <TD>character </TD> </TR> <TR> <TD><B> cr</B></TD> <TD>creature </TD> <TD><B> dei</B></TD> <TD>deity </TD> <TD><B> doc</B></TD> <TD>document </TD> <TD><B> ev</B></TD> <TD>event </TD> </TR> <TR> <TD><B> fic</B></TD> <TD>fiction </TD> <TD><B> group</B></TD> <TD>group </TD> <TD><B> leg</B></TD> 
<TD>legend </TD> <TD><B> myth</B></TD> <TD>mythology </TD> </TR> <TR> <TD><B> oth</B></TD> <TD>other </TD> <TD><B> rel</B></TD> <TD>religion</TD> <TD><B> serv</B></TD> <TD>service </TD> <TD><B> ship</B></TD> <TD>ship name </TD> </TR> <TR> <TD><B> wk</B></TD> <TD>work of art, literature, music, etc.</TD> <TD>-</TD> <TD>-</TD> <TD>-</TD> <TD>-</TD> <TD>-</TD> <TD>-</TD> </TR></TABLE></CENTER> <h3>Dictionary File Codes</h3> The THE_LOT and GLOSSDIC files have the following codes attached to each entry to show the dictionary file from which it has been selected. <CENTER><TABLE BORDER> <TR> <TD><B> CODE </B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> </TR><TR> <TD><B> AV</B></TD> <TD>aviation </TD> <td><B> BU</B></td> <td>buddhdic</td> <TD><B> CA</B></TD> <TD>cardic</TD> <TD><B> CC</B></TD> <TD>concrete</TD> </TR><TR> <TD><B> CO</B></TD> <TD>compdic</TD> <TD><B> ED</B></TD> <TD>edict (the rest)</TD> <TD><B> EP</B></TD> <TD>edict (priority subset)</TD> <td><B> ES</B></td> <td>engscidic</td> </TR><TR> <td><B> EV</B></td> <td>envgloss</td> <TD><B> FM</B></TD> <TD>finmktdic</TD> <td><B> FO</B></td> <td>forsdic_e</td> <TD><B> GE</B></TD> <TD>geodic </TD> </TR><TR> <TD><B> KD</B></TD> <TD>small hiragana dictionary for glossing </TD> <td><B> LG</B></td> <td>lingdic</td> <TD><B> LS</B></TD> <TD>lifscidic</TD> <TD><B> LW1/2</B></TD> <TD>lawdic1/2</TD> </TR><TR> <td><B> MA</B></td> <td>manufdic</td> <TD><B> NA</B></TD> <TD>enamdict</TD> <td><B> PL</B></td> <td>j_places (entries not already in enamdict)</td> <TD><B> PP</B></TD> <TD>pandpdic </TD> </TR><TR> <td><B> RH</B></td> <td>revhenkan (kanji/kana with no English translation yet)</td> <td><B> RW</B></td> <td>riverwater</td> <TD><B> SP</B></TD> <TD>special words & phrases</TD> <td><B> ST</B></td> <td>stardict</td> </TR><TR> <td><B> WI1/2</B></td> <td>wipfile (work-in-progress)</td> <td><B> -</B></td> <td>-</td> <td><B> 
-</B></td> <td>-</td> <td><B> -</B></td> <td>-</td> </TR> </TABLE></CENTER> <h3>Regional and Dialect Codes</h3> These tags indicate that a word or phrase is associated with a particular regional language variant within Japan. <CENTER><TABLE BORDER WIDTH="100%"> <TR> <TD><B> CODE </B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> <TD><B> CODE</B></TD> <TD><B> MEANING</B></TD> </TR><TR> <TD><B>hob</B></TD> <TD>Hokkaido</TD> <TD><B>ksb</B></TD> <TD>Kansai</TD> <td><B>ktb</B></td> <td>Kantou</td> <TD><B>kyb</B></TD> <TD>Kyouto</TD> </TR><TR> <TD><B>kyu</B></TD> <TD>Kyushu</TD> <TD><B>nab</B></TD> <TD>Nara</TD> <TD><B>osb</B></TD> <TD>Osaka</TD> <td><B>rkb</B></td> <td>Ryuukyuu</td> </TR><TR> <TD><B>thb</B></TD> <TD>Touhoku</TD> <TD><B>tsb</B></TD> <TD>Tosa</TD> <TD><B>tsug</B></TD> <TD>Tsugaru</TD> <td><B>-</B></td> <td>-</td> </TR> </TABLE></CENTER> <p> <h3>Kanji Dictionary Codes</h3> WWWJDIC uses the <em>KANJIDIC</em> file of kanji information. It has a system of letter codes in front of the various fields, e.g. "U798f B113 G3 S13 F467 ...". These are explained in the <a href="http://nihongo.monash.edu/kanjidic_doc.html#IREF03">full documentation</a> for that file. The "See an explanation ..." link below the kanji information display will give an expanded version of the fields. <h3>Others</h3> In addition to the codes above, for gairaigo which have not been derived from English words, the source language has been indicated using the three-letter codes from the ISO 639 "Code for the representation of names of languages" standard, e.g. "(fre: avec)". <p>In entries which are Japanese idiomatic expressions, aphorisms, etc. the literal translation of the Japanese is sometimes shown in parentheses, preceded by "lit:". 
Also where the Japanese word has been constructed by transliteration of two or more foreign words or word fragments (e.g., a <em>waseieigo</em> - Japanese-made English), the source words are indicated by "wasei:". <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="copyr_tag">COPYRIGHT</a></h2> <a href="http://creativecommons.org/licenses/by-sa/3.0/"><img src="http://creativecommons.org/images/public/somerights20.gif" alt="CC-SA"></a> <br> The material being displayed in WWWJDIC's pages is copyright. Much of it is drawn from dictionary files, the copyright of most of which is held by the <a href="http://www.edrdg.org/">Electronic Dictionary Research and Development Group</a> (EDRDG). Other material is associated with the WWWJDIC server and software. It is being made available under a Creative Commons <a HREF="http://creativecommons.org/licenses/by-sa/3.0/">Attribution-ShareAlike Licence</a> (V3.0) (<a href="http://creativecommons.org/licenses/by-sa/3.0/deed.ja">Japanese version</a>). (Note that the Japanese-Dutch file has a no-commercial-use Creative Commons licence.)<p> What does this mean in practical terms? Well: <ol type="a"> <li> you can use WWWJDIC in the same way as you use a published dictionary to assist you with translating text and words. The results of your translation may be published, sold, etc. If you make heavy use of WWWJDIC it would be nice to acknowledge that, but there is no requirement to do more; <li> you can link to WWWJDIC, e.g. using the backdoor entry, from other servers, provided you acknowledge that use on your server, and provide links to WWWJDIC and its documentation. <li> if you wish to publish significant extracts of the output from WWWJDIC, for example if you use the Translate Words in Text function to generate a vocabulary list for a textbook of reading passages, then this comes under the scope of the licence for the dictionary files, which permits publication of subsets of the files. 
You must acknowledge the source of this information. Other information produced by the server, e.g. the verb conjugation tables, may be published but the source must be acknowledged. <li> the Stroke Order Diagrams are under either Jack Halpern's or Jim Rose's copyright. You may link to the pages displaying those images, but you must not download or reuse the images without their respective permissions. <li> the example sentences are from the <a href="http://nihongo.monash.edu/tanakacorpus.html">Tanaka Corpus</a> and are in the Public Domain; </ol> For more details, see the <a href="http://www.edrdg.org/edrdg/licence.html">licence</a> statement covering the dictionary files. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="faq_tag">FAQ (Frequently Asked Questions)</a></h2> <b>Input</b> <ul> <li>[Q] I have been wondering if there is a way to include wildcards in a search using kanji keys. Often there is a bit in the middle of a word that is unclear to me.<br> [A] Well, you can search using one or more kanji sequences by making sure "Starting Kanji" is not selected. Then it will match on one or more kanji mid-word (if they are there). If you want to search on two non-adjacent kanji, put a space between them. (Then it will search on the first, and remove results that don't contain the second.) <li>[Q] Sometimes I see dictionary entries with words hyphenated or spaced and others that are not, e.g. "thumbtack" and "thumb tack". Is there any way I can ask for all these in a single lookup?<br> [A] Usually you can get them all by putting a space in your search key. For example, entering "thumb tack" will display entries containing "thumb tack", "thumbtack" and "thumb-tack". <li>[Q] It seems if I put in a reading of "sa" into the Find Kanji in the Database it only gets kanji with that exact reading. How can I get all the kanji with readings starting with "sa"?<br> [A] Put in "sa*" (either an ASCII * or a JIS one.) 
Note that it must be a <em>kana</em> search key for this to work. For romaji, put in something impossible, such as "sawi" - then it will match on everything starting with "sa". <li>[Q] I don't get any of the JIS X 0212 kanji when I specify a kanji selection.<br> [A] You need to click on the button to enable these (normally they are suppressed, as few users need them.)</li> <li>[Q] How do I specify a JIS X 0212 kanji when selecting a JIS code?<br> [A] Put an "h" in front of it, e.g. "h4064". ("h" is for hojo.)</li> </ul> <b>Display</b> <ul> <li>[Q] Why do some entries have the kana part first with the kanji following in &lt;&lt;...&gt;&gt;?<br> [A] That happens with entries that are tagged as "uk" (usually kana). Since the kanji forms are less common, it makes sense to display the more common kana forms first. <li>[Q] I can't read the kana readings. Will you add romaji display as an option?<br> [A] No. Better to learn kana. It will only take a week or two.</li> <li>[Q] Can't you arrange the order of the display so that when I ask for an English word I get the common ones first?<br> [A] Not very easily, as it would mean doing two passes over the entries, somehow keeping track where the server was up to, etc. Better that you select the "Restrict to common words" option at first, then if you don't get what you want, try again looking at all possible entries. Remember that it is really a Japanese-English dictionary, and you have to take your chances with English-Japanese.<br> <li>[Q] What are all those "vs" and "adj-na" tags on the dictionary displays? And what are the "ED" and "LS" when I translate words in text?<br> [A] Look at the link at the top of the page labelled "Dictionary-Codes". <br> <li>[Q] I have been looking at all the words containing a particular kanji, and a couple of entries seem to get displayed several times.<br> [A] Yes, that may happen if the kanji occurs more than once in the entry, which means that there is more than one index item pointing at the entry. 
The server will stop multiple displays on the one page, but can't detect them when they are spread across several pages. It is a bother, but fixing it would be very complicated. If you go to the customization page and increase the lines/page to 100 or so, the problem will probably go away. <li>[Q] When I look up kanji with the JIS 212 option turned on, the extra kanji show up as small graphics, which looks ugly. Is there any way I can get them in a better font?<br> [A] Go to the Customize page and select UTF8 operation. That will result in your local font being used for all kanji. <li>[Q] I understand the (P) on some entries means it is a common word. Why is the (P) sometimes attached to the kanji or reading?<br> [A] Usually the (P) is at the end of the entry. When there is more than one kanji headword and/or reading, the (P) is placed near the headword or reading which is the common one. <li>[Q] Some of the words marked (P) are actually not very common, and also there are some common words not marked. Why is this?<br> [A] The allocation of those (P) markers is based on a number of sources, and inevitably has some problems. (For more information about the markers, see the Word Priority Marking section in the <a href="http://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project#Word_Priority_Marking">JMdict/EDICT</a> documentation.) If you see any that are dubious, or see entries that you think deserve a marker, please use the amendment form to suggest a change. <li>[Q] If I use the Special Graphic Interface, then paste some Japanese text into the Translate Words in Japanese Text function, it responds: "There didn't seem to be any Japanese in this text!".<br> [A] Using the "Access" portal and then putting in Japanese text will often cause a problem. If you can paste Japanese into a browser, you shouldn't need to use the portal. For the paste-text option to work, the page needs to be set to a Japanese code-set. 
The portal is set to work in ISO-8859-1, AKA Latin-1, since it assumes you can't use Japanese codes. </ul> <b>Keitai/Cellphones/Mobile phones</b> <ul> <li>[Q] What about WWWJDIC via a smartphone app?<br> [A] For Android phones there is a very good app, just called WWWJDIC, which uses your selected server via the API. I don't know what the situation is with iPhones. <br> Of course many of WWWJDIC's functions are available locally via apps such as AEdict and ImaWa. <li>[Q] I can't use WWWJDIC from a J-Phone. I put in a search word, but get no reply, instead it goes to the main menu.<br> [A] Yes, I hope to fix that eventually. J-Phones use MML not HTML, and for some reason forms are sending in information that can't be decoded. <li>[Q] Are you planning to have a WAP interface for WWWJDIC?<br> [A] Perhaps one day. </ul> <b>Translate Words in Text</b> <ul> <li>[Q] In the text word translation you don't do all the words written just in hiragana - why is that?<br> [A] There are several reasons: (a) the beginnings of such words can be very difficult to detect when they are preceded by other kana as is often the case (particles, etc.). You need sophisticated segmentation software to do this. (b) many Japanese words share the same reading/pronunciation, and hence I would probably pick the wrong word. <br> At present I only handle words which are at least 4 kana long and which are found in a small list of kana-only words. <li>[Q] Why do you just translate the words in the text? Why don't you go the rest of the way and translate properly into English? <br> [A] Machine Translation (MT) is a huge and complex task. The WWWJDIC server is comparatively simple. If I ever developed a Japanese-English MT system (most unlikely), I'd sell it; not have it free on a WWW site. </ul> <b>Running WWWJDIC locally</b> <ul> <li>[Q] I want to have WWWJDIC's functions on my PC without having to use an Internet connection. 
Is there a stand-alone version I can download?<br> [A] No, and I can't see it happening in a hurry, as the server software is very unlike what you'll find in a stand-alone PC program. There is no reason why the functionality can't be in a stand-alone program, and some programs such as JQuickTrans do a similar job. I have considered developing a stand-alone program which has similar functionality to WWWJDIC, but there are some problems: <ul> <li> I would need a suitable cross-platform environment, i.e. all of Unix/Linux, Macintosh and Windows. These are not simple or cheap. <li> it would be a huge software development task. I'm getting a bit old for such things. Also there is the overhead of installation software, all that extra documentation, answering users' questions on why it won't install, run, etc. etc. Not to mention distributing new versions, updates, etc. <li> the dictionaries would fall out of date. The great thing about a server is that it always has the most up-to-date software and data. </ul> So don't hold your breath waiting for a local PC version of WWWJDIC. I have some information about stand-alone software on the <a href= "http://nihongo.monash.edu/edict.html">EDICT home page</a>. <li>[Q] I have hunted for the source of WWWJDIC and can't find it. Where is it?<br> [A] Locked up on the servers. I haven't released it, and at this stage have no intention of doing so. It is continually being modified, and I want to keep it under my control (after all, it is <b>my</b> ego trip). I don't want any clones of WWWJDIC running around at this stage. Also it is a vast slab of C program code; mostly undocumented. To release it would require a significant amount of installation, etc. documentation to be written. And I'd still be plagued with "I couldn't make it do XXXX" emails. </ul> <b>Miscellaneous</b> <ul> <li>[Q] I'd like to have links in WWW pages which do a WWWJDIC lookup for a particular word. 
Is there any way of having the keyword in the URL?<br> [A] At present the only way to do that is via the API, and it's a bit fiddly creating the correct URL. I realise this would be useful, so I've added an option for creating such URLs on the fly. At the bottom of the search results page look for "URL link generation". Clicking on that will create the right URL to use. You can select which dictionary you want to use too. <li>[Q] Can I use UTF8 with WWWJDIC?<br> [A] You can. If you go to the customize page, select UTF8 and allow for a cookie you can talk to WWWJDIC in UTF8. Moreover, it will send you back real text, not bitmaps, for the JIS212 kanji and the diacritics in the French/German/Buddhist files. <li>[Q] I like the Stroke Order Diagrams. Why do some kanji not have them?<br> [A] The initial raw diagrams were provided by Jack Halpern, and were prepared for the Kodansha Kanji Learners Dictionary. The coverage of that book is a bit over 2,000 kanji, so that's all the diagrams available from that source. The second set are from Jim Rose's SODER project. That's expanding, so one day there might be a lot more kanji available. <li>[Q] Some of the Stroke Order Diagrams don't really match the displayed kanji, e.g. 武. <br> [A] This is because the glyph (shape) of the written form is different. <li>[Q] I'm looking for a way to set the "common words" and "exact word match" checkboxes by default.<br> [A] At present the only way to do this is to save a copy of that page to your PC and edit it adding CHECKED to the <input ... > for each of the checkboxes. Eventually I hope to be able to make these sorts of things more configurable using cookies. <li>[Q] Your server is very slow. Why don't you rewrite it in ... or move it to the .... server technology?<br> [A] Actually the servers are <b>not</b> slow at all. They are all fast systems, and the code is quite light-weight. Most requests are served in a fraction of a second. 
To some users it may seem slow because of network delays and congestion. If this is your case, try using a mirror site closer to you. <li>[Q] How do I specify that I want my default dictionary to be "the_lot"? The customization doesn't allow that.<br> [A] I really should add that to the customization. In the meantime you can either (a) bookmark the dictionary search screen, then without your browser running edit the URL in the bookmark file to say "wwwjdic?MC", or (b) go to the initial dictionary search screen, change the "wwwjdic?1C" to "wwwjdic?MC", press enter to go to the "new" URL, then bookmark it. Note that "M" may change; see the question below.</li> <li>[Q] The BackDoor method relies on you knowing the letters that identify the dictionaries, but these seem to change. How can I tell which letter is used for a dictionary?<br> [A] Your browser will have a "View Page Source" somewhere. Use it on one of the pages with a drop-down menu for selecting the dictionary, and down the bottom of the page the menu HTML will show the letters as 'VALUE=".."'. </ul> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="history">WWWJDIC HISTORY</a></h2> <em>(By Jim Breen)</em> <p>No sooner had the WWW come into being than servers accessing my dictionary files began to appear. The first, which operated briefly in 1994, was a slight rework of my <tt>xjdic</tt> program by Otfried Schwarzkopf. It overtaxed his 386, and was closed down fairly quickly; however, by that stage Jeffrey Friedl's famous Dictionary engine was running. There was also Rafael Santos' system, the EVA/POETS engine at Notre Dame in Tokyo, PSP's ALISE-based system, etc. etc., as well as Lambert Schomaker's WWW edition of the KANJIDIC file. Most of these have faded away now.</p> <p>I had intended to have a WWW version of xjdic right from the moment I knew about the WWW, and in 1994 collected some information on writing CGI programs ready for the assault. 
It always seemed too big a task, and anyway Jeffrey's server was doing a good job. Eventually in mid-1997 it got too much for me, as I wanted to experiment with some features not handled by Jeffrey's server, and I also wanted to see <b>my</b> name in the WWW lights, so I filleted out the search-engine parts of xjdic and dashed off a new CGI-oriented front-end. It only took a week or two of spare time and was up and running. I could easily have done it years before.</p> <p>WWWJDIC has proved popular, and has probably overtaken the early lead Jeffrey's server established. It has been relatively easy to modify, so I have tinkered with it quite a bit (see below.) In fact, it is now probably the major vehicle for me trying out things to do with Japanese dictionaries.</p> <p>Starting late in 1998, I have installed a number of mirrors. The first two were quite a bit of work as I had effectively written a lot of hard-coded stuff pointing at the Monash site. The code is now fairly portable (for a Unix/Linux box running Apache.) Having a lot of mirrors brought in the problem of keeping them up-to-date. To handle this, in 2000 I set up an "rsync server" at Monash and have set "cron" scripts running at the mirror sites which periodically interrogate the Monash site and collect and install any updated files.</p> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="whatsnew">WHAT'S NEW</a></h2> <ul> <li> added the large Warodai Japanese-Russian dictionary. <em>November 2016</em></li> <li> added the display of the multi-radical elements of a kanji to the Backdoor/API. <em>April 2015</em></li> <li> added the [ViewDB] option. <em>Feb 2015</em></li> <li> added the Examples regular expression search option to the Backdoor/API. <em>July 2013</em></li> <li> added the Advanced Search option. <em>October 2012</em></li> <li> tidied up the Edit/Promote tags so that people don't promote non-English entries. 
Added linking information to the other language dictionaries. <em>August 2012</em></li> <li> added the first release of the Italian dictionary. The Dutch dictionary is now on automatic monthly update. <em>February 2012</em></li> <li> added multi-radical kanji lookups and example sentence searching to the backdoor. <li>added the Japanese WordNet as an optional dictionary. <em>May 2011</em></li> <li>revised the access to the edit database. <em>Mar 2011</em></li> <li>allowed half-width kana in the input. <em>Mar 2011</em></li> <li>added "skip names" option in text glossing. <em>Sep 2010</em> <li>added links to Richard Sears' Chinese etymology site. <em>Aug 2010</em> <li>changed over to using the JMdictDB for edits to the EDICT file. <em>Jul 2010</em> <li>changed the format of the display for "uk" (usually kana) entries. See the FAQ above. <em>Jul 2010</em> <li>extended the backdoor to include raw output as well as the Exact Match and Exact/Common combinations. Extended it to include kanji stroke-counts. <em>Mar 2010</em> <li>linking to Tatoeba for the Example sentence updates. <em>Jan 2010</em> <li>the Japanese-Slovenian dictionary. <em>Dec 2009</em> <li>the extension of the Japanese-Spanish dictionary. <em>Oct 2009</em> <li>the links to the Japanese WordNet. <em>May 2009</em> <li>the links to audio clips at JapanesePod101.com. <em>April 2009</em> <li>added colours for different kanji classes in the main dictionary display. <em>September 2008</em> <li>withdrew the old "Front Page". The Word Search page is now the default entry to the server. <em>July 2008</em> <li>the customization settings get remembered when you go to the Customize Page, making it easier to alter the settings without redoing everything. <em>July 2008</em> <li>the [G][GI][S][A] links now use the readings for links when the entry is marked "(uk)", i.e. usually in kana. <em>June 2008</em> <li>revised the page headers, layout of dictionary display. Added a lot more customization options. 
<em>May 2008</em> <li>added in-line selected example sentences for some entries. <em>May 2008</em> <li>added Jim Rose's Stroke Order Diagrams as an alternative to Jack Halpern's. <em>April 2008</em> <li>made the GAIDIC file accessible as a dictionary in its own right. <em>April 2008</em> <li>added the Spanish file, and reordered the dictionary files, putting the non-English ones together. <em>April 2008</em> <li>added the options to ignore either or both katakana or hiragana sequences in the Translate Words function. Also, if a "partial match" is found in a katakana string, the rest of the katakana is ignored. <em>February 2008</em> <li>extended the multi-radical kanji to include the additional 5,801 kanji in JIS X 0212. Now handles over 12,000 kanji. <em>January 2008</em> <li>added a "common words" option to the backdoor method. <em>April 2007</em> <li>tidied up the footer area of the main display pages to make follow-on activities a bit clearer. <em>February 2007</em> <li>added the option to limit the multi-radical kanji lookup to Jouyou kanji. <em>January 2007</em> <li>added the Japanese Wikipedia [W] option to the dictionary display. <em>October 2006</em> <li>new amendment/new entry form which gets the name and email address from the cookie. <em>October 2006</em> <li>added the CARDIC file. <em>October 2006</em> <li>installed a new version of the BUDDHDIC file, with over twice as many entries as the previous one. Thanks again to Chuck Muller. <em>August 2006</em> <li>added the option of turning on "no repeated translations" for text glossing via the backdoor. Also added the [A] (Eijiro) option to the dictionary display. <em>August 2006</em> <li>added the feature to search for two-word phrases using an underscore between words. <em>July 2006</em> <li>when searching using a hiragana key, any EDICT entries marked with a (P) are displayed first. <em>July 2006</em> <li>extended the backdoor to allow lookups of kanji using the kanji itself, or with kana readings. 
<em>June 2006</em> <li>added the expanded LAWDIC file as a dictionary in its own right, removing it from the MISCDIC file. <em>May 2006</em> <li>added the REVHENKAN file. <em>November 2005</em> <li>changed the testing of the second word in the filtering function to be case-insensitive. <em>November 2005</em> <li>converted the Multi-Radical form into a table, so that now it looks more like JWPce, etc. (Got the idea from Jim Rose's Ice Mocha.) <em>August 2005</em> <li>changed the file used for glossing text to "glossdic". The "the_lot" file is now simply a concatenation of all the files. <em>August 2005</em> <li>added the "#" option for inputting katakana keys. <em>March 2005</em> <li>added the "[Partial Match]" warning to the output from the "Translate Words in Text" function. <em>October 2004</em> <li>added the River and Water Resources Glossary file. <em>October 2004</em> <li>added handling for inflections/conjugations of about 2,000 common verbs (kana only at this stage.) <em>September 2004</em> <li>added the hints about alternative English words or spellings. <em>August 2004</em> <li>added an option to the Multi-radical search to allow users to find out which elements are in a kanji. <em>July 2004</em> <li>enabled a combined exact-match/priority word option in the Keitai EJ lookup. <em>July 2004</em> <li>added 41,000 entries to the J-FRENCH file <em>May 2004</em> <li>added the option to search the examples file. For lookups on keys containing kanji, made any "exact matches" appear first. <em>Apr 2004</em> <li>replaced the JDDICT file with the full EDICT-format version of the WaDokuJT file. <em>Mar 2004</em> <li>added the "manufdic" file to the "miscdic" and "the_lot" files. <em>Oct 2003</em> <li>added the "hidden translations" option to the text glossing function, and the gloss option to the example sentences. <em>Aug 2003 </em> <li>added a little Javascript to make the blinking text cursor appear in text boxes when a page loads. Only works with recent browsers. 
<em>Aug 2003 </em> <li>fiddled the romaji-kana conversion to allow ASCII periods to be used in KUN readings for kanji lookups. <em>May 2003</em> <li>added the Example sentence feedback system, and the revised EDICT format. <em>April 2003</em> <li>added the "gaidic" and "envgloss" files to MISCDIC and THE_LOT. <em>Feb/March 2003</em> <li>added the [V] verb conjugation function. <em>November 2002</em> <li>replaced the ENAMDICT version used in text glossing. The version now used has kanji names with multiple readings in a single entry. <em>October 2002</em> <li>added the cookie option to the customization system. <em>September 2002</em> <li>added UTF8 as a coding option. <em>September 2002</em> <li>added the "exact match" option. <em>August 2002</em> <li>added links from EDICT entries to the example sentences in the Tanaka corpus. <em>August 2002</em> <li>replaced the small German dictionary with a bigger one incorporating part of the WaDokuJT file. <em>August 2002</em> <li>added the Russian and Buddhism dictionary files. Extended the JIS212 handling to include the non-kanji and to use HTML entities when it can. Added the links from the Buddhism dictionary to the main DDB. <em>July 2002</em> <li>add the "@" trick from xjdic for flagging romaji input strings. <em>June 2002</em> <li>added ISO-2022-JP support for backdoor strings. Note that the code-setting for these is the same as for EUC. <em>June 2002</em> <li>replaced LIFSCIDIC with V4 of that file. <em>May 2002</em> <li>added UTF-8 coding as an option to backdoor strings to enable Mozilla, etc. to handle Javascript buttons for pages coded in UTF-8. <em>May 2002</em> <li>extended the Stroke Order Diagram handling to cover all 2,230 jouyou and jinmeiyou kanji. <em>Apr 2002</em> <li>replaced the buttons on the front page with coloured table entries, using CSS. Why? (a) it loads more quickly than the previous images, (b) easier to update, especially on mirror sites. 
<em>Mar 2002</em> <li> expanded the text glossing: (a) now handles many compound verbs, (b) handles words with o/go prefixes in kana. <em>Dec 2001</em> <li> added a facility to match against hiragana-only words in text. <em>Nov 2001</em> <li> added a stripped-down text translation facility suitable for I-mode devices. <em>Nov 2001</em> <li> added links to the Unicode.org database for each kanji being displayed. The Uxxxx is now a link. <em>Nov 2001</em> <li> added the links to animated stroke order diagrams for the Grades 1 to 6 kanji. <em>Aug 2001</em> <li> tightened the parsing rules for long runs of kanji to reduce mistakes; allowed trailing particles when the match is correct; included the j_places file in the "the_lot" file. <em>Aug 2001</em> <li> option to restrict search to the more common entries. <em> Aug 2001</em> <li> option to look up a displayed word in the Sanseido dictionary at the Goo server. <em>May 2001</em> <li> support for the O'Neill's Essential Kanji indices, which are now in the kanji database. <em>May 2001</em> <li> included the option to do a Google search for each displayed headword (thanks to Shaun Lawson for showing me how easy it is). <em>May 2001</em> <li> support for the Kanji Learners Dictionary codes, which are now in the kanji database. <em>April 2001</em> <li> the "stardict" file was added to miscdic and the_lot <em>Jan 2001</em> <li>A new version of the "radkfile", which drives the "Multi-radical" kanji lookup. At the same time some JIS212 images were introduced on that display which better match the elements used. <em>Jan 2001</em> <li>Redid the front page, using images for the "buttons" and adding the DoCoMo options and the new button generator as full items. <em>Jan 2001</em> <li>Installed a Japanese mirror site (finally) at the ILCAA in the Tokyo University of Foreign Studies. <em>December 2000</em> <li>A section in the documentation explaining the codes and tags. 
<em>November 2000</em> <li>Extended the "backdoor" method to handle (a) Japanese codes not in EUC (b) text glossing. Added the ability to handle Unicode (UCS2)-coded search keys in some circumstances. All these were done to support the various types of Javascript Taskbar Buttons. <em>November 2000</em> <li>A system in which the mirror sites are automatically updated with the latest files. Now working for all mirror sites. <em>October 2000</em> <li>Added the option to suppress all but the first of duplicated translations in the word-in-text translation function. Tightened up the removal of trailing particles for jukugo, and extended this function to gairaigo. <em>August 2000</em> <li>Converted all the JIS212 images to PNG format to avoid violating the Unisys patent over GIF formats. <em>June 2000</em></li> <li>Added the linking to the jeKai Project. <em>June 2000</em></li> <li>Split the ENAMDICT entries in the THE_LOT file into two priority sets to help the choice of the more appropriate version when there are multiple readings of a name. (Now superseded.)<em>June 2000</em></li> <li>Revised the TITLE headings on pages to make them different. This is to help book-marking the main entry pages. <em>May 2000</em> <li>Added special stripped-down starting pages tailored to the microbrowsers used in the NTT DoCoMo mobile phones. These pages turn on Shift-JIS operating, and invoke an internal "docomo" mode which limits the amount of detail in the resulting display. <em>(Apr 2000)</em></li> <li>Added the option of outputting in Shift-JIS as well as the default EUC. (Did I hear you ask why? Well the NTT DoCoMo phones won't hack EUC pages, and some people want to use WWWJDIC on them.) <em>(Mar 2000)</em></li> <li>Added the option to break on end-of-line characters when glossing text. <em>(Mar 2000)</em></li> <li>Changed the front page to a slightly more modern-looking set of buttons. Added Silas Brown's "access" bit-map server as an option. 
<em>(Oct-Nov 1999)</em></li> <li>tidied up the Text Translation feature, eliminating line breaks, tabs, etc. from the text, and putting in a go-back-to-the start. Extended the "the_lot" file by marking out the 15,000 most important entries. <em>(Sep 1999)</em></li> <li>reformatting the displays to make the follow-on actions a bit more logical. Adding support for the De Roo codes. Restructuring the site-specific aspects to facilitate setting up mirrors. <em>(Feb 1999)</em></li> <li>enabling multiple kanji to be examined at once via pasting them into the request line. <em>(October 1998)</em></li> <li>enabling the kanji-selection to be limited to Jouyou & Jinmeiyou kanji. <em> (October 1998)</em></li> <li>enabling the retention of non-Japanese text in the WWW-page word translation feature. <em>(September 1998)</em></li> <li>detection of Shift-JIS in cut-and-paste text, and its conversion to EUC. (Was not reliable for short text, so was changed to a user option.) <em>(August 1998)</em></li> <li>the creation of the THE_LOT combination dictionary file, and its setting as the default for text and WWW page glossing, and the incorporation of the LAWDIC file into the MISCDIC file. Fine-tuning the glossing function to favour some subfiles. <em>(August 1998)</em></li> <li>the extension of the jukugo translation function to operate on specified WWW pages. <em>(July 1998)</em></li> <li>addition of the function to translate jukugo, etc. from a slab of Japanese text. <em>(July 1998)</em></li> <li>addition of the ability to repeat a search in different dictionaries. <em>(June 1998)</em></li> <li>expansion of the kanji database to include itaiji cross-reference information and SKIP codes in the JIS X 0212 kanji. <em>(May 1998)</em></li> <li>expansion of the display of the XJnnnnn itaiji cross-reference information in KANJIDIC/KANJD212 to include a link to the variant, and the display of each variant. <em>(May 1998)</em></li> <li>inclusion of the J_PLACES file. 
<em>(Apr 1998)</em></li> <li>support for the index numbers from the New Nelson dictionary. These are now an option on the Kanji Selection screen. <em>(Feb 1998)</em></li> <li>the three initial entry screens (from the front-page) can now be saved as book-marks. <em>(Dec 1997)</em></li> <li>the inclusion of the classic Four Corner index on the Kanji page, and at the same time added links to pages describing the Four Corner & SKIP codes. <em>(4 Dec 1997)</em></li> <li>the addition of Timothy Huang's Big5 Database information to the kanji-level links. <em>(17 November 1997)</em></li> <li>the unification of the KANJD212 file into the kanji database now used by the server. The KANJD212 file is no longer treated as just another dictionary file. Display of the JIS X 0212 kanji is done by in-line GIF images, as very few browsers support this standard. <em>(22 Oct 1997)</em></li> <li>links to Christian Wittern's KanjiBase character database at the University of Goettingen in Germany. <em>(19 Sep 1997)</em></li> <li>a direct URL access (no POST) to enable cross-linking from other WWW dictionaries, etc. Email me if you want details. <em>(12 Sep 1997)</em></li> <li>the inclusion of my KANJD212 file as one of the dictionaries. <em>(12 Sep 1997)</em></li> <li>a system of links to other WWW dictionaries. The first to go live are Chuck Muller's WWW CJK dictionary and Rick Harbaugh's Chinese Character Genealogy Dictionary. You can link to them from the kanji display page, and see their information about the selected kanji. <em>(12 Sep 1997)</em></li> <li>the support for a second "word" in English keyword searches. This word is used as a filter, and is case-sensitive, however it can occur within a longer word. Try looking for "home stay" or "treasure house" to see how it works. <em> (11 Sep 1997)</em> (BTW, this works in Japanese searches too!)</li> <li>user customization of screen parameters, colours, and input coding. 
<em> (8 Sep 1997)</em></li> </ul> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="planimp_tag">PLANNED IMPROVEMENTS</a></h2> <ul> <li>Allowing for "relaxed" romaji spelling, blurring the various ambiguities such as writing "ji" and "zu", vowel lengths, etc.</li> <li>Improving the front-end by a judicious use of frames and, perhaps, Javascript. (If I do this it will be an option.)</li> </ul> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="bugs_tag">KNOWN BUGS & PROBLEMS</a></h2> <ul> <li>entering Japanese text using the IE browser on a Mac with the Kotoeri IME sometimes results in mangled kana. It seems that other browsers such as Mozilla and Safari are fine. No solution is suggested, other than to try another browser. <li>WWWJDIC can sometimes be made to crash by sending very long strings into the text-glossing function via the backdoor (URL) method. It is due to something being overwritten, and is platform dependent. I suspect an undersized environment variable is the problem. Try a smaller amount of text if this happens. <li>WWWJDIC occasionally crashes, producing a "core" dump. This occurs about once every month, i.e. in a minute proportion of accesses. The user will probably be sent an "internal error" message. I am curious to track down the cause of these crashes, so if one occurs while you are using it, please email me on: jimbreen(at)gmail.com with the details.</li> <li>If you choose a compound from the display to look at the kanji within it, and at the same time change dictionaries, it tries to get the compound from the new dictionary, with unpredictable results. 
(I might not fix this; more a feature than a bug.)</li> <li>If you combine a two-word search with the common-word restriction, it stops working after the first page of results is displayed.</li> <li>If you do a lookup using a kanji or English word which occurs more than once in an entry, the entry may get displayed more than once. This is because there is more than one index item pointing at the entry. The server will stop multiple displays on the one page, but can't detect them when they are spread across several pages. It is a bother, but fixing it would be very complicated in a stateless server. </ul> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="browser_tag">BROWSING IN JAPANESE</a></h2> <p>As WWWJDIC provides no support for the display of Japanese words in a romanized form (Romaji), you will require a browser capable of displaying Japanese kana and kanji. In the past this was something of a challenge, but virtually all modern browsers have this functionality. <p> <font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="techbits_tag">TECHNICAL BITS</a></h2> <h3>Structure</h3> WWWJDIC is a single C program which takes its parameters from the URL (QUERY_STRING) and from the various buttons (POST method). It carries as much as it can of the user's state by loading the values of the various radio/checkboxes. View the source of some of the screens if you want to see how the CGI stuff is working. (NB: As mentioned above, it uses the POST method for receiving parameters from the browser; not the GET method. Some WWW query systems can only use the GET method, and thus will not currently work with WWWJDIC.)<br> <p> No database system is used. Each dictionary file is a single text file with a dictionary entry per line. Associated with each text file is a sorted index file containing pointers to each word or token in an entry. 
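<p>As a rough illustration of the file-plus-index layout described here, the following Python sketch builds a sorted token index over a small line-per-entry file and looks a word up by binary search. It is not the server's C code: the sample entries, the tokenising rule and the names <tt>build_index</tt> and <tt>lookup</tt> are all assumptions made for this example, and a real index would hold pointers into the text file rather than Python tuples.</p>

```python
import bisect
import re

# Hypothetical sample data: one dictionary entry per line, loosely EDICT-like.
ENTRIES = [
    "画鋲 [がびょう] /thumbtack/drawing pin/",
    "辞書 [じしょ] /dictionary/lexicon/",
    "本 [ほん] /book/volume/",
]


def build_index(entries):
    """Return a sorted list of (token, entry_number) pairs.

    Each distinct word or token in an entry gets its own index item,
    so one entry can be reached from several different keys.
    """
    index = []
    for number, line in enumerate(entries):
        for token in set(re.findall(r"\w+", line.lower())):
            index.append((token, number))
    index.sort()
    return index


def lookup(index, entries, word):
    """Binary-search the index and return the entries containing the word."""
    key = word.lower()
    # (key,) sorts before every (key, n) pair, so this lands on the first match.
    position = bisect.bisect_left(index, (key,))
    found = []
    while position != len(index) and index[position][0] == key:
        found.append(entries[index[position][1]])
        position += 1
    return found


INDEX = build_index(ENTRIES)
print(lookup(INDEX, ENTRIES, "dictionary"))
```

<p>Because the index is sorted, a lookup needs only a logarithmic number of comparisons before it walks the run of matching items, which is why no separate database system is needed for this kind of search.</p>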
A binary search is used to find an entry/entries which contain the desired word, making the dictionary lookup extremely fast and efficient. The examples file is handled in a similar fashion, except a quasi-dictionary is used which has pointers to the sentences which contain particular words. (This method of dictionary indexing was introduced in 1990 in the original DOS "JDIC" program, and is also used in the <tt>xjdic</tt> program.) <p> The program runs under the Apache server and on a number of different Unix-like operating systems, including Solaris, AIX, FreeBSD and several Linux distributions. No attempt has been made to run it under Windows. <p> I originally planned to have a permanent dictionary search engine, with CGI programs calling it, as happens with Jeffrey's dictionary server. In the end I did not go ahead with this, as memory-mapped handling of the read-only dictionary files, and the significant caching carried out by the file system, achieves the same efficiency goal anyway. <p> <h3>Japanese Character Codes</h3> WWWJDIC uses the EUC-JP coding for all its files and all internal processing. EUC-JP is also the default coding for the HTML it generates. <p> The characters encoded in the files are from the JIS X 0208 character set which contains the Japanese kana and the 6,355 most common kanji along with the Russian and Greek sets, plus the JIS X 0212 character set which includes a further 5,801 kanji plus some Latin characters with diacritics (acute, grave, umlaut, etc.) <p> When pages are displayed using the EUC-JP or Shift_JIS encodings, characters from JIS X 0212 are displayed either as HTML entities or as 16x16 bit-mapped images. If the optional UTF-8 coding is used, all characters are displayed in that coding. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="repbugs_tag">BUG REPORTS</a></h2> Reports of errors in the server software or configuration, or in the dictionary, etc. files are most welcome. 
The best ways to report these are: <ol type="a"> <li> for errors in the dictionary files, use the "[Edit]" link after each entry display. <li> for errors in the example sentences (Tanaka corpus), use the "Send Comments/Correction" option on the page where the examples are displayed. <li> for other errors, e.g. server malfunction, email me (Jim) at <a href="mailto:jimbreen@gmail.com">jimbreen@gmail.com</a>. </ol> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="mirror_tag">MIRROR SITES</a></h2> Mirror sites stay up-to-date by connecting to the master site at Monash once each day, retrieving a manifest file, then retrieving any updated source or data files. The file retrieval is done using the <a href="http://rsync.samba.org/rsync/">rsync</a> system, which is excellent for transferring only the changed portions of large files. (There is an anonymous rsync server running at Monash for this purpose.) According to the settings in the manifest file, modified source files are compiled, index files are generated, etc. as part of this daily update. <p> I get a number of enquiries from people offering to host mirrors. I am not actively seeking many more mirrors; however, I like to have a reasonable geographic spread. The basic requirements for a mirror site are: <ol type="a"> <li>I must have an account on the system. Installation is complicated and not well documented. <li> it must be a permanent arrangement, or at least one capable of being used for several years. I don't want to go to the trouble of setting it up only to have it withdrawn. <li> it must be a Unix-like operating system (Solaris, Linux, AIX, etc.). It would take a major rewrite to get it to work in Microsoft's ASP, and I have no motivation to do that. <li> it must have an Apache server running, plus a full suite of utility software, including gcc, wget, lynx, rsync, etc. <li> it must be very well connected to the Internet. Having a poorly connected mirror is a waste of time. 
<li> about 200MB of disk space is required to hold the data and program files. The mirror will operate satisfactorily on a system with 256MB of RAM and 512MB of swap space; however, more is better, especially if other systems are sharing the server. The CPU load is relatively small; however, a faster processor will reduce the time spent indexing the dictionary files during the daily update. </ol> Note, I don't provide mirrors for individuals. Setting up and maintaining a mirror takes quite a lot of my time. <p> <h3>Personal Mirrors</h3> I get a lot of requests from people wanting to have a mirror on their own machine for local off-line use. At present I have to say <b>"no"</b>. The code and data files are reasonably complex and largely undocumented. I simply do not have the time or energy to write installation and maintenance documentation, or to answer the inevitable questions that would arise. <p> Also there is the issue of quality control. I make several changes to either the code or data every week. I can't guarantee personal mirrors would stay in step with all this, and I hate getting emails about things I have already fixed. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="backdoor_tag">BACKDOOR ENTRY/API</a></h2> If you want to interface to WWWJDIC from another page or a CGI program, there is a "backdoor" or API (application programming interface) entry which enables simple searches to be initiated via the URL QUERY_STRING. To use this, you must use the URL associated with the WWWJDIC CGI program, with the "backdoor" code set. The format is: <ul> <li><tt> https://www.edrdg.org/cgi-bin/wwwjdic/wwwjdic?nMtkxxxxxx</tt> (or <tt>nZtkxxxxxx</tt> for raw output)<br> (or its equivalent on a mirror. Note that some sites use "http" rather than "https", and some require a ".cgi" suffix)</li> </ul> where: <ul> <li>n = dictionary to use (1 = EDICT, 3 = ENAMDICT, etc. 
Examine the source of one of the pages to get the full list of codes.)</li> <li>M = backdoor entry (regular display) or Z = backdoor entry (raw dictionary display)</li> <li>t is the search type: <ul> <li> for dictionary lookups use: <ul> <li> D where the lookup text is in ASCII, EUC, ISO-2022-JP or UCS (the old uxxxx format); <li> S where the lookup text is in Shift_JIS; <li> U where the lookup text is in UTF-8. </ul> <li> for kanji lookups use: <ul> <li> K for all lookups via codes, codepoints, etc. or where the text string (e.g. kanji or reading) is in EUC; <li> M for all lookups using a text string in UTF-8; <li> N for all lookups using a text string in Shift_JIS. </ul> <li> for text glossing (translating words in text) use: <ul> <li> G where the text is in EUC, ISO-2022-JP or UCS; <li> H where the text is in Shift_JIS; <li> I where the text is in UTF-8. </ul> <li> for multi-radical kanji lookups use: <ul> <li> B where the text is in EUC, ISO-2022-JP or UCS; <li> C where the text is in Shift_JIS; <li> F where the text is in UTF-8. </ul> <li> for example sentence lookups via indexed Japanese words, use E. <li> for example sentence lookups using a regular expression, use T. <li> for seeing the radicals used to index a kanji, use R. </ul> <li>k is the key type: <ul> <li> for dictionary lookups: <ul> <li> for English keys use E, or P to get just "common words", Q to get an "exact match" and R to get both; <li> for Japanese keys which begin a term use J (this is mandatory for romaji keys) and P to get just "common words" (doesn't work with romaji); for keys which start with kanji anywhere in a term use M, and N to get just common words; and for single-kanji keys use K for keys in the first position and L for kanji keys in any position. </ul> <li> for kanji lookups, use M followed by the KANJIDIC letter codes (B, U, V, N, etc.) or J if a reading or kanji is being provided. 
An optional stroke-count or stroke-count range can be included by placing it between "=" characters; <li> for text glossing use G, or H to turn on the "no repeated translations" option. <li> for multi-radical kanji lookups use J for jouyou kanji-only, H to include JIS X 0212 kanji, and X for anything else. An optional stroke-count or stroke-count range can be included by placing it between "=" characters. <li> for example sentence lookups with indexed words use E for EUC, ISO-2022-JP or UCS, S for Shift_JIS and U for UTF-8, followed by "lookupword=n=kana=". The kana is optional and is there to disambiguate between different headwords. For "n", 1 => random selection of 10, anything else => display up to 100 sentences starting at n. <li> for example sentence lookups using a regular expression, use E for EUC, ISO-2022-JP or UCS, S for Shift_JIS and U for UTF-8, followed by the search string. Up to 99 example sentences may be displayed. <li> for displaying the radical elements of a kanji, use E for EUC, ISO-2022-JP or UCS, S for Shift_JIS and U for UTF-8, followed by the kanji itself. 
</ul> <li>xxxxxx is the search key or text itself.</li> </ul> <em>(Note that Japanese text must be URL-escaped, with each byte written as %xx.)</em> <p> <b>Examples</b> <p><tt>1MKU4ed8</tt> - look up the kanji with the Unicode codepoint "4ed8".</p> <p><tt>1MMJ%E4%BB%98</tt> - look up the kanji with the UTF-8 code of "E4BB98".</p> <p><tt>1MMMB140=6-7=</tt> - look up the kanji with the Bushu code of "140" and a stroke count of 6 to 7.</p> <p><tt>4MDJkoujou</tt> - look up the Japanese word "koujou" (romanized) in dictionary 4.</p> <p><tt>1MDJ%C0%E8%C0%B8</tt> - look up the Japanese word "sensei" (in kanji) using EUC-JP coding.</p> <p><tt>1MSJ%90%E6%90%B6</tt> - as above, but in Shift_JIS.</p> <p><tt>1ZUJ%E5%85%88%E7%94%9F</tt> - as above, but in UTF-8 and producing a "raw output" display.</p> <p><tt>1MDErabbit</tt> - look up the word "rabbit" in EDICT.</p> <p><tt>9MGG%xx%xx%xx%xx%xx%xx%xx</tt> - gloss the (EUC) text.</p> <p>If you want to change the colour, number of lines per page, etc., you can also add the URL customization parameters at the end of the URL string, e.g.:</p> <p><tt>1MDEhorse_2_25_5_pink</tt> - look up "horse" and return the results on a pink page in Shift_JIS with 25 lines/page.</p> <p>Note that if you want to use this method with other sites, you will need to modify the URL accordingly.</p> <p>The "raw" dictionary display option is intended for calls from other programs, smartphones, etc. It omits all header and footer information from the pages, and displays the unedited dictionary entries in EDICT and KANJIDIC format, one per line and encapsulated by &lt;pre&gt; ... &lt;/pre&gt;. In this option the output is always in UTF-8 coding. 
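<p>The format described above can be sketched in a few lines of Python. This is only an illustration for dictionary lookups (search types D, S and U), not part of WWWJDIC itself: the names <tt>backdoor_query</tt>, <tt>backdoor_url</tt> and <tt>DICT_SEARCH_TYPE</tt> are hypothetical, and the use of Python codec names to select the search type is an assumption of the sketch. ASCII letters and digits are left unescaped, as in the romaji and English examples above.</p>

```python
from urllib.parse import quote

# Base URL of the main site, as given above; mirrors may differ.
BASE_URL = "https://www.edrdg.org/cgi-bin/wwwjdic/wwwjdic"

# Search-type letter for dictionary lookups, chosen by the encoding of the key.
# The dictionary keys are Python codec names (an assumption of this sketch).
DICT_SEARCH_TYPE = {"euc_jp": "D", "shift_jis": "S", "utf-8": "U"}


def backdoor_query(dictionary, key, key_type="J", encoding="utf-8", raw=False):
    """Build the part after '?' for a dictionary lookup (hypothetical helper)."""
    entry = "Z" if raw else "M"  # Z = raw display, M = regular display
    search_type = DICT_SEARCH_TYPE[encoding]
    # Encode the key in the chosen coding, then percent-escape the bytes.
    # ASCII letters and digits (romaji or English keys) are left as they are.
    escaped = quote(key.encode(encoding), safe="")
    return f"{dictionary}{entry}{search_type}{key_type}{escaped}"


def backdoor_url(dictionary, key, **options):
    """Return the full URL for the main site (hypothetical helper)."""
    return BASE_URL + "?" + backdoor_query(dictionary, key, **options)


if __name__ == "__main__":
    sensei = "\u5148\u751f"  # the word "sensei" in kanji
    print(backdoor_query(1, sensei, encoding="euc_jp"))
    print(backdoor_query(1, sensei, encoding="utf-8", raw=True))
    print(backdoor_url(1, "rabbit", key_type="E", encoding="euc_jp"))
```

<p>Kanji lookups, text glossing and the other search types use different letters for "t" and "k", so a real client would need to extend the table accordingly. For a mirror, the base URL would have to be changed as noted above.</p>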
<br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="feed_tag">FEEDBACK</a></h2> If you want to ask questions about WWWJDIC or provide some feedback: <ul> <li>email the author, Jim Breen, at "jimbreen (at) gmail dot com"; <li> join the <a href="https://groups.yahoo.com/neo/groups/edict-jmdict/info">mailing-list/group</a> and make it a more open discussion. </ul> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="don_tag">DONATIONS</a></h2> Several kind people have asked if they can make donations to the WWWJDIC project, including the EDICT, ENAMDICT, etc. dictionary files. Well yes, they can. The project is part of the <a href="http://www.edrdg.org/"> Electronic Dictionary Research and Development Group</a>, and donations help fund the ongoing development of the dictionaries and software. Also, as the home site of WWWJDIC, EDICT, etc. is on a commercial site (Jim has now retired from Monash), it is great to have it self-funding and not have to rely on things like advertising. <p>If you are inclined to make a donation it would be most welcome. There are two ways of donating: <ul> <li> make a donation via PayPal using a credit or debit card. Simply click on the following button and follow the instructions. 
<form action="https://www.paypal.com/cgi-bin/webscr" method="post"> <input type="hidden" name="cmd" value="_xclick"> <input type="hidden" name="business" value="Jim.Breen@infotech.monash.edu.au"> <input type="hidden" name="item_name" value="Electronic Dictionary Research Group"> <input type="hidden" name="item_number" value="EDRGDON001"> <input type="hidden" name="no_shipping" value="2"> <input type="hidden" name="no_note" value="1"> <input type="hidden" name="currency_code" value="AUD"> <input type="hidden" name="tax" value="0"> <input type="hidden" name="bn" value="PP-DonationsBF"> <input type="image" src="https://www.paypal.com/en_US/i/btn/x-click-but21.gif" name="submit" alt="Make payments with PayPal - it's fast, free and secure!"> <img alt="" border="0" src="https://www.paypal.com/en_AU/i/scr/pixel.gif" width="1" height="1"> </form> <li> take out a 1-year subscription to WWWJDIC. This is just the same as making a donation, but the documentation from PayPal may assist people who use WWWJDIC in their business activities to claim it as a tax deduction. Again click on the following button and follow the instructions. 
<form action="https://www.paypal.com/cgi-bin/webscr" method="post"> <input type="image" src="https://www.paypal.com/en_AU/i/btn/btn_subscribe_LG.gif" name="submit" alt="PayPal - The safer, easier way to pay online."> <img alt="" border="0" src="https://www.paypal.com/en_AU/i/scr/pixel.gif" width="1" height="1"> <input type="hidden" name="cmd" value="_xclick-subscriptions"> <input type="hidden" name="business" value="jimbreen@gmail.com"> <input type="hidden" name="item_name" value="WWWJDIC Japanese Dictionary Service"> <input type="hidden" name="no_shipping" value="1"> <input type="hidden" name="no_note" value="1"> <input type="hidden" name="currency_code" value="AUD"> <input type="hidden" name="lc" value="AU"> <input type="hidden" name="bn" value="PP-SubscriptionsBF"> <!-- <input type="hidden" name="a3" value="50.00"> --> <input type="hidden" name="p3" value="1"> <input type="hidden" name="t3" value="Y"> <input type="hidden" name="sra" value="1"> Enter the amount you would like to pay for the subscription: <input type="text" name="a3" value="100.00"> </form> </ul> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="disc_tag">DISCLAIMER</a></h2> The WWWJDIC server uses dictionary files from a wide variety of sources. Some of these files have been compiled and edited by Jim Breen and others associated with the JMdict/EDICT project, and while every effort has been made to ensure their accuracy, there are sure to be some errors. Other files have come from external sources and are of varying qualities.<p> Monash University and other providers of the WWWJDIC server make NO WARRANTY as to the accuracy of the information provided by the servers and advise users that any use of the servers is ENTIRELY at their own risk. <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <h2><a name="ack_tag">ACKNOWLEDGEMENTS</a></h2> I want to record my thanks to a few of the key people who have helped with the server. E&OE. 
<ul> <li>Otfried Schwarzkopf, who showed me initially that it could be done; <li>Jeffrey Friedl, whose excellent and popular server inspired me to get started; <li>Jamie Scuglia, then our School's sysadmin at Monash, who was most helpful and supportive in getting it going initially;</li> <li>Chuck Musciano and Bill Kennedy, authors of the O'Reilly HTML book, from which I taught myself about HTML;</li> <li> the good people who made the present and past mirrors available, including: Kendon Stubbs and Susan Munson (UofV), William Maton (Canada), Masayuki Toyoshima (Japan), Warren Togami (Hawaii), Jacek Rutkowski (Poland and EU), Jens and Ola (Sweden), Folken Lacour de Fanel (Chile), Sarwono Sutikno and Mr Waskita (Indonesia);</li> <li> Brodie Thiesfield, who showed me how to redo the front page using CSS; <li>Shoji Yamazaki and Bart Mathias, who gave me a lot of feedback and help on the verb conjugations;</li> <li>Paul Blay, muchan and Kouji Ueshiba, who did much of the translation of the Japanese interface option; </li> <li> Peter Galante, Charles Abbott and all the people at JapanesePod101.com who have made the inclusion of audio clips possible;</li> <li> the many people who have emailed suggestions and messages of support. </ul> <br><font size="-1"><a href="#top_tag">[Return to the top]</a></font> <hr> Go to <a href="http://nihongo.monash.edu/japanese.html"> Jim Breen</a>'s Japanese Page. </body> </html>