The Unicode Cookbook for Linguists

Managing writing systems using orthography profiles

by Steven Moran, Michael Cysouw


Book Description

This text is a practical guide for linguists and programmers who work with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together at the intersection of the Unicode Standard and the International Phonetic Alphabet (IPA). Although these standards are often met with frustration by users, they nevertheless provide language researchers and programmers with the consistent computational architecture needed to process, publish, and analyze lexical data from the world's languages. We therefore bring to light common, but not always transparent, pitfalls that researchers face when working with Unicode and IPA. Having identified and overcome the pitfalls involved in making writing systems and character encodings syntactically and semantically interoperable (to the extent that they can be), we created a suite of open-source Python and R tools for working with languages using orthography profiles, which describe author- or document-specific orthographic conventions. In this cookbook we describe a formal specification of orthography profiles and provide recipes using open-source tools to show how users can segment text, analyze it, identify errors, and transform it into different written forms for comparative linguistics research.
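To make the idea of an orthography profile concrete, here is a minimal sketch in Python. It is not the book's tooling or its formal specification: the profile contents, the function names `segment` and `transform`, and the greedy longest-match strategy are all assumptions chosen for illustration. It only shows how a table of graphemes can drive segmentation, error detection, and transformation of a string.

```python
import unicodedata

# Hypothetical profile for an imagined orthography: grapheme -> target form.
# Both the graphemes and their target values are illustrative assumptions.
PROFILE = {
    "tsh": "tʃ",
    "ts": "ts",
    "sh": "ʃ",
    "a": "a",
    "\u00e1": "a",  # precomposed "a with acute"
    "t": "t",
    "s": "s",
    "h": "h",
}


def _normalized(profile):
    """Return the profile with NFD-normalized grapheme keys."""
    return {unicodedata.normalize("NFD", g): v for g, v in profile.items()}


def segment(text, profile):
    """Split text into profile graphemes by greedy longest match.

    Returns (segments, errors); errors lists (index, character) pairs
    for characters that no grapheme in the profile covers.
    """
    # Normalize the input the same way as the profile keys, so that
    # precomposed and decomposed spellings are treated alike.
    text = unicodedata.normalize("NFD", text)
    graphemes = _normalized(profile)
    longest = max(len(g) for g in graphemes)
    segments, errors = [], []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in graphemes:
                segments.append(chunk)
                i += size
                break
        else:
            # Nothing matched: record the character and move on.
            errors.append((i, text[i]))
            segments.append("\ufffd")
            i += 1
    return segments, errors


def transform(segments, profile):
    """Map each segment to its target form; unknown segments pass through."""
    graphemes = _normalized(profile)
    return [graphemes.get(s, s) for s in segments]


segments, errors = segment("tshats\u00e1", PROFILE)
print(segments, errors)
print(transform(segments, PROFILE))
```

Normalizing both the input and the profile keys to the same form before matching is one design choice among several; it is used here because visually identical strings can otherwise fail to match when one is precomposed and the other decomposed.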

This open book is licensed under a Creative Commons License (CC BY). You can download The Unicode Cookbook for Linguists ebook for free in PDF format (1.0 MB).

Table of Contents

Chapter 1: Writing systems
Chapter 2: The Unicode approach
Chapter 3: Unicode pitfalls
Chapter 4: The International Phonetic Alphabet
Chapter 5: IPA meets Unicode
Chapter 6: Practical recommendations
Chapter 7: Orthography profiles
Chapter 8: Implementation

Book Details

Publisher: Language Science Press
Published: 2018
Pages: 148
Edition: 1
Language: English
ISBN13: 9783961100910
ISBN10: 3961100918
ISBN13 Digital: 9783961100903
ISBN10 Digital: 396110090X
PDF Size: 1.0 MB
License: CC BY
