• Santali (Ol Chiki)
    To train our machine learning language models, we need a text corpus: a large collection of existing texts written in a particular language. Finding text data in some of these languages can be challenging. When we can’t find data online, we share a list of writing prompts so that new text corpora can be produced. (You can read about our work on all of these languages in one of our latest research papers.)
  • Persian

    We focus on the layout design. Designing a layout for a brand new language on Gboard takes research and analysis to fit in all of the characters in a way that makes sense. If there isn’t much information about the language available online, we analyze text corpora to determine which characters to include and how often each one is used.
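
    As a rough illustration of that corpus analysis, the sketch below counts how often each character appears in a set of text lines. It is not Gboard’s actual tooling: the function names `character_frequencies` and `rank_characters` and the toy sample lines are assumptions made up for this example, and a real corpus would be far larger.

```python
from collections import Counter
import unicodedata


def character_frequencies(corpus_lines):
    """Count how often each non-whitespace character appears in a corpus.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for line in corpus_lines:
        # Normalize to NFC so canonically equivalent sequences are counted together.
        normalized = unicodedata.normalize("NFC", line)
        counts.update(ch for ch in normalized if not ch.isspace())
    return counts


def rank_characters(counts):
    """Return (character, relative frequency) pairs, most frequent first."""
    total = sum(counts.values())
    return [(ch, n / total) for ch, n in counts.most_common()]


if __name__ == "__main__":
    # Toy placeholder lines; a real analysis would read text in the target language.
    sample = ["aab", "b c"]
    for ch, share in rank_characters(character_frequencies(sample)):
        print(f"U+{ord(ch):04X} {ch} {share:.2%}")
```

    A ranking like this could suggest which characters deserve a spot on the main layout and which can sit behind a long-press or a secondary page, although the final choice still depends on the research described above.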
    How we add new languages to Gboard

    A look at the layouts below shows the sheer diversity of input methods used around the world every day: