Summary
Linguistics
Debian Science Linguistics packages
This metapackage is part of the Debian Pure Blend "Debian Science"
and installs packages related to Linguistics.
The list to the right includes various software projects which are of some interest to the Debian Science Project. Currently, only a few of them are available as Debian packages. It is our goal, however, to include all software in Debian Science which can sensibly add to a high quality Debian Pure Blend.
If you discover a project that looks like a good candidate for Debian Science,
or if you have prepared an unofficial Debian package, please do not hesitate to
send a description of that project to the Debian Science mailing list.
|
Official Debian packages with high relevance
|
Apertium
Shallow-transfer machine translation engine
|
| Versions of package apertium |
| Release | Version | Architectures |
| sid | 3.1.0-1.1 | s390,alpha,amd64,armel,hppa,hurd-i386,i386,ia64,mips,mipsel,powerpc,sparc |
| squeeze | 3.1.0-1.1 | sparc,powerpc,ia64,i386,hppa,s390,armel,amd64,mipsel,mips |
| lenny | 3.0.7+1-2~lenny2+b1 | amd64 |
| lenny | 3.0.7+1-2~lenny2 | armel,i386,sparc,hppa,ia64,s390,mips,arm,powerpc,alpha,mipsel |
| etch | 1.0.3-3 | ia64,arm,s390,mips,mipsel,amd64,sparc,hppa,alpha,i386 |
| Debtags of package apertium: |
| field | linguistics |
| role | program |
|
License: DFSG free
|
|
An open-source shallow-transfer machine translation
engine, Apertium is initially aimed at related-language pairs.
It uses finite-state transducers for lexical processing,
hidden Markov models for part-of-speech tagging, and
finite-state based chunking for structural transfer.
The system is largely based upon systems already developed by
the Transducens group at the Universitat d'Alacant, such as
interNOSTRUM (Spanish-Catalan, http://www.internostrum.com/welcome.php)
and Traductor Universia (Spanish-Portuguese,
http://traductor.universia.net).
It will be possible to use Apertium to build machine translation
systems for a variety of related-language pairs simply by providing
the linguistic data needed in the right format.
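To illustrate what hidden-Markov-model part-of-speech tagging means in general, here is a minimal sketch of first-order Viterbi decoding. It is not Apertium's code or API: the function name `viterbi`, the tag set, and all probabilities below are made-up toy values chosen only for the example.

```python
def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable tag sequence for `words` under a toy HMM.

    Probabilities missing from the tables fall back to `floor`
    (an assumption of this sketch, not a real smoothing method).
    """
    if not words:
        return []
    # Each column maps tag -> (probability of best path ending in tag, previous tag).
    best = [{
        tag: (start_p.get(tag, floor) * emit_p[tag].get(words[0], floor), None)
        for tag in tags
    }]
    for word in words[1:]:
        column = {}
        for tag in tags:
            # Pick the best predecessor for this tag.
            prob, prev = max(
                (best[-1][p][0] * trans_p[p].get(tag, floor), p) for p in tags
            )
            column[tag] = (prob * emit_p[tag].get(word, floor), prev)
        best.append(column)
    # Backtrace from the most probable final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Hypothetical toy model (values invented for illustration).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"dog": 0.1, "barks": 0.5},
}

tagged = viterbi(["the", "dog", "barks"], TAGS, START, TRANS, EMIT)
```

With these toy tables, `tagged` assigns one tag per input word. A real tagger would estimate the tables from corpus data.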
|
|
|
Artha
A handy off-line thesaurus based on WordNet
|
| Versions of package artha |
| Release | Version | Architectures |
| sid | 0.9.1-1 | mips,armel,mipsel,i386,powerpc,amd64,s390,ia64,sparc,alpha,hppa |
| squeeze | 0.9.1-1 | sparc,amd64,armel,hppa,i386,ia64,mips,mipsel,powerpc,s390 |
|
License: DFSG free
|
|
Artha is a handy English thesaurus with distinctive features such as
lookup on a global hotkey press, passive notifications of a
selected text's definitions, suggestions for misspelled words, etc.
Once launched, it sits in the system tray monitoring for a preset
hotkey combination. When some text is selected in any window and
the hotkey is pressed, it pops up with the word looked up. Should
the user prefer passive notifications over the application popping up,
this can be done by enabling the notifications option.
Artha is written from scratch in pure C using GTK+, with WordNet
as its database corpus. It may be used as an advanced replacement
for the proprietary WordWeb in GNU/Linux environments.
|
|
|
Link-grammar
Carnegie Mellon University's link grammar parser for English
|
| Versions of package link-grammar |
| Release | Version | Architectures |
| sid | 4.3.9-2 | i386,mips,mipsel,hppa,sparc,powerpc,armel,s390,amd64,ia64,alpha |
| squeeze | 4.3.9-2 | armel,mips,sparc,i386,mipsel,hppa,amd64,ia64,s390,powerpc |
| lenny | 4.2.5-1 | s390,alpha,amd64,arm,armel,hppa,i386,ia64,mips,mipsel,powerpc,sparc |
| sid | 4.2.5-1 | hurd-i386 |
| etch | 4.2.2-4etch1 | mipsel,amd64,sparc,s390,hppa,mips,alpha,arm,powerpc,ia64,i386 |
| etch-security | 4.2.2-4etch1 | sparc,alpha,mips,ia64,arm,i386,s390,hppa,amd64,powerpc,mipsel |
| Debtags of package link-grammar: |
| field | linguistics |
| interface | commandline |
| role | program |
| use | checking |
| works-with | dictionary |
|
License: DFSG free
|
|
In Sleator, D. and Temperley, D. "Parsing English with a Link Grammar"
(1991), the authors defined a new formal grammatical system called a
"link grammar". A sequence of words is in the language of a link
grammar if there is a way to draw "links" between words in such a way
that the local requirements of each word are satisfied, the links do
not cross, and the words form a connected graph. The authors encoded
English grammar into such a system, and wrote this program to parse
English using this grammar.
link-grammar can be used for linguistic parsing for information
retrieval or extraction from natural language documents. It can also be
used as a grammar checker.
This package contains the user-executable binary.
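The two structural conditions in the description above (links do not cross, and the words form a connected graph) can be sketched as a small check. This is an illustration only, not the link-grammar library's API: `is_valid_linkage` is a hypothetical name, words are represented by their positions in the sentence, and the third condition (each word's local linking requirements, which come from the dictionary) is not modelled here.

```python
def is_valid_linkage(num_words, links):
    """Check planarity and connectivity of a proposed set of links.

    `links` is an iterable of (i, j) pairs of word positions, 0-based.
    Word-level linking requirements are NOT checked in this sketch.
    """
    if num_words == 0:
        return True
    norm = [tuple(sorted(link)) for link in links]

    # Planarity: two links (a, b) and (c, d) cross when a < c < b < d.
    for a, b in norm:
        for c, d in norm:
            if a < c < b < d:
                return False

    # Connectivity: every word must be reachable from word 0.
    neighbours = {i: set() for i in range(num_words)}
    for a, b in norm:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == num_words


# A chain of links over four words: planar and connected.
chain_ok = is_valid_linkage(4, [(0, 1), (1, 2), (2, 3)])
# Links 0-2 and 1-3 cross, so this linkage is rejected.
crossing = is_valid_linkage(4, [(0, 2), (1, 3), (0, 1)])
```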
|
|
|
Wordnet
Electronic lexical database of the English language
|
| Versions of package wordnet |
| Release | Version | Architectures |
| sid | 3.0-21 | s390,hppa,ia64,mips,i386,alpha |
| sid | 3.0-20 | powerpc,sparc,mipsel |
| sid | 3.0-18 | amd64,armel |
| squeeze | 3.0-18 | ia64,mipsel,i386,hppa,sparc,powerpc,mips,armel,amd64,s390 |
| sid | 3.0-14 | hurd-i386 |
| lenny | 3.0-13 | s390,alpha,amd64,arm,armel,hppa,i386,ia64,mips,mipsel,powerpc,sparc |
| etch | 2.1-4+etch2 | mipsel,amd64,sparc,s390,hppa,mips,alpha,arm,powerpc,ia64,i386 |
| etch-security | 2.1-4+etch2 | sparc,alpha,mips,ia64,arm,i386,s390,hppa,amd64,powerpc,mipsel |
| Debtags of package wordnet: |
| field | linguistics |
| interface | x11 |
| role | program |
| scope | application |
| uitoolkit | tk |
| use | checking |
| works-with | dictionary |
| x11 | application |
|
License: DFSG free
|
|
WordNet(C) is an on-line lexical reference system whose design is
inspired by current psycholinguistic theories of human lexical
memory. English nouns, verbs, adjectives and adverbs are organized
into synonym sets, each representing one underlying lexical
concept. Different relations link the synonym sets.
WordNet was developed by the Cognitive Science Laboratory
(http://www.cogsci.princeton.edu/) at Princeton University under the
direction of Professor George A. Miller (Principal Investigator).
WordNet is considered to be the most important resource available to
researchers in computational linguistics, text analysis, and many
related areas. Its design is inspired by current psycholinguistic and
computational theories of human lexical memory.
This package contains the binary and man pages of WordNet as well as general man pages.
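The idea of synonym sets connected by relations can be sketched with a small data structure. This is a toy model, not the WordNet database format or any WordNet API: the `Synset` class, the identifiers, and the three entries are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """One lexical concept: a set of synonymous lemmas plus typed relations."""
    id: str
    pos: str                       # part of speech, e.g. "n" for noun
    lemmas: tuple
    relations: dict = field(default_factory=dict)  # relation name -> list of synset ids


# Hypothetical miniature lexicon (not real WordNet data).
SYNSETS = {
    s.id: s
    for s in [
        Synset("dog.n", "n", ("dog", "domestic dog"), {"hypernym": ["canine.n"]}),
        Synset("canine.n", "n", ("canine", "canid"), {"hypernym": ["carnivore.n"]}),
        Synset("carnivore.n", "n", ("carnivore",)),
    ]
}


def synonyms(word):
    """Return all other lemmas that share a synset with `word`."""
    found = set()
    for synset in SYNSETS.values():
        if word in synset.lemmas:
            found.update(l for l in synset.lemmas if l != word)
    return found


def hypernym_chain(synset_id):
    """Follow the first 'hypernym' relation upwards until none is left."""
    chain = []
    current = SYNSETS[synset_id]
    while current.relations.get("hypernym"):
        current = SYNSETS[current.relations["hypernym"][0]]
        chain.append(current.id)
    return chain
```

Here `synonyms("canid")` looks up the words sharing a concept with "canid", and `hypernym_chain("dog.n")` walks one kind of relation between synsets.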
|
|
No known packages available
|
Wnsqlbuilder
SQL version of WordNet 3.0
|
License: GPL
Debian package not available
|
|
WordNet SQL Builder is a Java utility that generates an SQL database from
the standard WordNet database as released by the WordNet Project (Princeton
University).
Features
- Support for MySQL and PostgreSQL.
- Complete port (however, orphaned morphological forms are dropped, and
  so are VerbNet/XWordNet data that cannot be linked to WordNet entries).
- Incremental build support.
- Retains the synset index as primary key, allowing easy reference to the
  original WordNet database.
- Includes support for WordNet 3.0.
- Includes support for WordNet 2.0 to 2.1, 2.1 to 3.0, and 2.0 to 3.0 sense maps.
- Includes support for VerbNet 2.3.
- Includes support for XWordNet 2.0-1.1.
- Ready-to-use database (see wnsqldatabase package in download section) including:
  - WordNet 3.0
  - WordNet 2.0 to 2.1, 2.1 to 3.0, and 2.0 to 3.0 sense maps
  - VerbNet 2.3
  - XWordNet 2.0-1.1
  - British National Corpus statistical data (for commonly used words)
|
|