Tokenization in Java with AIF

NFT - Galleria
20 Sep 2021

In this video I'm going to show how naive tokenization can be done in Java with AIF. More importantly, we will learn how AIF actually executes tokenization itself, and we will discuss the outstanding tasks in the tokenization modules. After watching the video, you will be able to start sending your commits to the AIF project today.

AIF repo: https://github.com/b0noI/AIF2
Branch with the code from the video: https://github.com/b0noI/AIF2/tree/nlp-course-m1e1-tokenization
AIF issue tracker: https://github.com/b0noI/AIF2/issues

Issues about the tokenization modules:
https://github.com/b0noI/AIF2/issues/254 Revisit default list of token separators in PredefinedTokenSeparatorExtractor
https://github.com/b0noI/AIF2/issues/252 TokenSplitter should be renamed to Tokenizer
https://github.com/b0noI/AIF2/issues/251 Extract TokenSeparatorExtractor classes into a separate package
https://github.com/b0noI/AIF2/issues/250 PredefinedTokenSeparatorExtractor should take its list of characters from the config instead of having it hardcoded
https://github.com/b0noI/AIF2/issues/249 Find replacement for RegexpCooker

If some of the issues have already been closed, do not worry: you can still send your pull request. Of course, in that case it is not going to be merged, but I will review it and provide my feedback and comments.
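For readers who want a feel for what "naive tokenization" means before watching, here is a minimal sketch in plain Java. It does not use AIF and is not the code from the video: the class name NaiveTokenizer and the SEPARATORS set are hypothetical, chosen only to illustrate splitting text on a predefined list of separator characters, which is roughly the idea the issue about PredefinedTokenSeparatorExtractor refers to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not an AIF class.
class NaiveTokenizer {

    // Assumed separator list for the example, not AIF's actual default list.
    static final Set<Character> SEPARATORS =
            Set.of(' ', '\t', '\n', ',', '.', '!', '?', ';', ':');

    // Splits the text on any separator character and drops empty tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (SEPARATORS.contains(c)) {
                // A separator ends the current token, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // Flush the last token when the text does not end with a separator.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [Hello, world, How, are, you]
        System.out.println(tokenize("Hello, world! How are you?"));
    }
}
```

A hardcoded separator set like this is the simplest possible design. Moving it into configuration, as one of the issues above suggests for AIF, would let the list change without recompiling.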
