AILLA is a digital archive of recordings and texts in and about the indigenous languages of Latin America. Access to archive resources is free of charge. Most of the resources in the AILLA database are available to the public, but some have special access restrictions.
The heart of the collection is recordings of naturally occurring discourse in a wide range of genres, including narratives, ceremonies, oratory, conversations, and songs. Many of these recordings are accompanied by transcriptions and translations in Spanish, English, or Portuguese. These works contain a wealth of information about Latin American indigenous cultures as well as knowledge about the natural environments that the people live in.
The archive also collects materials about these languages, such as grammars, dictionaries, ethnographies, and research notes. The collection includes teaching materials for bilingual education and language revitalization programs in indigenous communities, such as primers, readers, and textbooks on a variety of subjects, written in indigenous languages.
The Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) holds computer-based (digital) materials about Australian Indigenous languages in the Aboriginal Studies Electronic Data Archive (ASEDA). ASEDA holds materials including dictionaries, grammars, and teaching materials, and represents about 300 languages. ASEDA offers a free service of secure storage, maintenance, and distribution of electronic texts relating to these languages.
This collection contains linguistic data, stories, songs/chants and other material in about 90 languages, most of them endangered or rare. The largest group of languages represented is the result of field work sponsored by the Survey of California and Other Indian Languages (UCB, Department of Linguistics).
Cascadilla Proceedings Project is an imprint of Cascadilla Press. We created CPP as a new model for proceedings of linguistics conferences and workshops. All proceedings published by CPP are available both in print and on the web. Web access is free and unrestricted, and the copy available on the web is the same as the book version in content, formatting, and pagination. The print edition is a hardback which meets library binding standards. This combination allows for the best of both worlds: free and quick access for researchers looking for a proceedings paper, with all the advantages of being published in book form.
Type in a word or phrase in one of seven languages (English, French, German, Spanish, Hebrew, Russian, Chinese) and see how its usage frequency has been changing throughout the past few centuries. Addictive.
ELSNET is a Europe-based forum dedicated to human language technologies. It operates in an international context, and will consider, across discipline boundaries, all human communication research areas related to language and speech.
Its main objective is to advance human language technologies in a broad sense by bringing together Europe's key players in research, development, integration or deployment in the field of language and speech technology and neighbouring areas.
This page describes archives which house materials that are intended to document and describe human languages, such as wordlists, lexicons, annotated signals, interlinear texts, paradigms, field notes, and linguistic descriptions.
The LINGUIST List is dedicated to providing information on language and language analysis, and to providing the discipline of linguistics with the infrastructure necessary to function in the digital world. LINGUIST maintains a website with over 2,000 pages and runs a mailing list with over 25,000 subscribers worldwide. LINGUIST also hosts searchable archives of over 100 other linguistic mailing lists and runs research projects which develop tools for the field, e.g., a peer-reviewed database of language and language-family information, and recommendations of best practice for digitizing endangered languages data.
LINGUIST is a free resource, run by linguistics professors and graduate students, and supported entirely by your donations.
The National Gallery of the Spoken Word (NGSW) is an ongoing five-year research project funded under the Digital Library Initiative II, spearheaded by the National Science Foundation. The NGSW is creating an online, fully searchable digital library of spoken word collections spanning the 20th century at HistoricalVoice.org. NGSW provides storage for these digital holdings and public exhibit "space" for the most evocative collections. From Thomas Edison's first cylinder recordings and the voices of Babe Ruth and Florence Nightingale to Studs Terkel's timeless interviews and the oral arguments of the US Supreme Court, the collections of the NGSW digital library cover a variety of interests and topics.
This catalog, developed by the Open Language Archives Community (OLAC), provides access to a wealth of information about thousands of languages, including details of text collections, audio recordings, dictionaries, and software, sourced from dozens of digital and traditional archives.
Omniglot is a guide to the writing systems and languages of the world.
It also contains tips on learning languages, language-related articles, quite a large collection of useful phrases in many languages, multilingual texts, a multilingual book store and an ever-growing collection of links to language-related resources.
The speech accent archive uniformly presents a large set of speech samples from a variety of language backgrounds. Native and non-native speakers of English read the same paragraph, and their readings are carefully transcribed. The archive is used by people who wish to compare and analyze the accents of different English speakers.
WALS is a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars) by a team of more than 40 authors (many of them the leading authorities on the subject).
WALS consists of 141 maps with accompanying texts on diverse features (such as vowel inventory size, noun-genitive order, passive constructions, and "hand"/"arm" polysemy), each of which is the responsibility of a single author (or team of authors). Each map shows between 120 and 1370 languages, each language being represented by a symbol, and different symbols showing different values of the feature. Altogether 2,650 languages are shown on the maps, and more than 58,000 datapoints give information on features in particular languages.