The Unicode Cookbook for Linguists

Managing writing systems using orthography profiles

by Steven Moran, Michael Cysouw

Book Description

This text is a practical guide for linguists and programmers who work with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together at the intersection between the Unicode Standard and the International Phonetic Alphabet (IPA). Although these standards are often met with frustration by users, they nevertheless provide language researchers and programmers with the consistent computational architecture needed to process, publish and analyze lexical data from the world's languages. We therefore bring to light common, but not always transparent, pitfalls that researchers face when working with Unicode and IPA. Having identified and overcome the pitfalls involved in making writing systems and character encodings syntactically and semantically interoperable (to the extent that they can be), we created a suite of open-source Python and R tools for working with languages using orthography profiles, which describe author- or document-specific orthographic conventions. In this cookbook we describe a formal specification of orthography profiles and provide recipes using open-source tools to show how users can segment text, analyze it, identify errors, and transform it into different written forms for comparative linguistics research.
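
To make the idea of an orthography profile concrete, here is a minimal sketch in plain Python. It is not the authors' tool suite or its API: the `PROFILE` table, the function names `segment` and `transliterate`, and the greedy longest-match strategy are all assumptions chosen for illustration. It shows the three steps the description mentions: segmenting text into the units a profile lists, flagging characters the profile does not cover, and transforming the text into another written form.

```python
import unicodedata

# Hypothetical toy profile (an assumption for illustration, not taken from
# the book): each source grapheme maps to a target written form.
PROFILE = {
    "tsch": "tʃ",
    "sch": "ʃ",
    "d": "d",
    "e": "e",
    "u": "u",
    "é": "e",
}

UNKNOWN = "\ufffd"  # replacement character, marks input the profile lacks


def _normalized(profile):
    """Return the profile with NFC-normalized keys, so that precomposed and
    decomposed spellings of the same grapheme compare equal."""
    return {unicodedata.normalize("NFC", k): v for k, v in profile.items()}


def segment(text, profile):
    """Split text into profile graphemes using greedy longest match."""
    table = _normalized(profile)
    text = unicodedata.normalize("NFC", text)
    longest = max(len(k) for k in table)
    segments = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                segments.append(chunk)
                i += size
                break
        else:
            # No profile entry matches here: flag it and move on.
            segments.append(UNKNOWN)
            i += 1
    return segments


def transliterate(text, profile):
    """Map each segment to its target form; unknown segments stay flagged."""
    table = _normalized(profile)
    return "".join(table.get(s, UNKNOWN) for s in segment(text, profile))


print(segment("deutsch", PROFILE))
print(transliterate("deutsch", PROFILE))
```

Normalizing both the profile and the input before matching is the design choice that matters here: without it, a decomposed "e" followed by a combining acute accent would not match the precomposed "é" in the profile, even though the two look identical on screen.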

This open book is licensed under a Creative Commons License (CC BY). You can download The Unicode Cookbook for Linguists ebook for free in PDF format (1.0 MB).

Table of Contents

Chapter 1: Writing systems
Chapter 2: The Unicode approach
Chapter 3: Unicode pitfalls
Chapter 4: The International Phonetic Alphabet
Chapter 5: IPA meets Unicode
Chapter 6: Practical recommendations
Chapter 7: Orthography profiles
Chapter 8: Implementation

Book Details

Title: The Unicode Cookbook for Linguists
Publisher: Language Science Press
Published: 2018
Pages: 148
Edition: 1
Language: English
ISBN13: 9783961100910
ISBN10: 3961100918
ISBN13 Digital: 9783961100903
ISBN10 Digital: 396110090X
PDF Size: 1.0 MB
License: CC BY
