CWS/POS/NER

Chinese word segmentation, part-of-speech tagging, and medical named entity recognition from scratch.

Getting Started

Dependencies:

tensorflow

# training, testing and evaluation
python3 run.py

Generated files:

Evaluation.md - Markdown table of evaluation results
Result/ - prediction results
FinalResult/ - final prediction results

Structure

├── Data         => dataset provided by the TA
│   ├── devset
│   ├── testset1
│   └── trainset
├── Evaluation   => evaluation scripts provided by the TA
│
├── CWS          => CWS model
├── POS          => POS tagging model
├── NER          => NER model
│
├── constant.py  => global constants and variables
│
├── dataset.py   => data preprocessing
├── model.py     => high-level model API for all models
├── evaluate.py  => high-level evaluation API
└── run.py       => the entire pipeline (training, testing, evaluation)

CWS
POS
NER (TODO)

Task Description

The data and evaluation scripts are provided by the TA.

Directory Structure

Data (each set has its own _cws, _pos, and _ner file):
- devset
- testset1
- trainset
- final
  - test2.txt - raw article
Evaluation:
- pos_evaluate.py
- ner_evaluate.py

Resources

Article

Paper

Sequence Tagging

Bidirectional LSTM-CRF Models for Sequence Tagging
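
In a BiLSTM-CRF tagger, the CRF layer picks the best tag sequence by Viterbi decoding over per-position emission scores and tag-to-tag transition scores. The sketch below is a plain-Python illustration of that decoding step only. It is not this repository's code and not the tensorflow/contrib/crf implementation; the function name `viterbi_decode`, the two-tag toy scores, and the omission of start/stop transitions are assumptions made for brevity.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at position t (e.g. BiLSTM output)
    transitions[i][j] : score of moving from tag i to tag j
    Start/stop transitions are left out to keep the sketch short.
    """
    num_tags = len(transitions)
    score = list(emissions[0])      # best score of any path ending in each tag
    backpointers = []               # backpointers[t-1][j] = best previous tag
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
        score = new_score
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to the first position.
    path = [max(range(num_tags), key=lambda i: score[i])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example with two hypothetical tags: 0 = B, 1 = E.
emissions = [[2.0, 1.0], [1.0, 1.2], [0.5, 1.0]]
transitions = [[-5.0, 1.0],   # B->B is penalized, B->E is encouraged
               [1.0, -5.0]]   # E->B is encouraged, E->E is penalized
best_path = viterbi_decode(emissions, transitions)
print(best_path)
```

With these toy scores, picking the best tag at each position independently would give [0, 1, 1], while the transition scores steer Viterbi to [0, 1, 0]. This is the benefit a CRF layer adds over per-position classification.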

Chinese Word Segmentation

Tool references

pkuseg

ACM Digital Library - Fast online training with frequency-adaptive learning rates for Chinese word segmentation and new word detection

@inproceedings{DBLP:conf/acl/SunWL12,
author = {Xu Sun and Houfeng Wang and Wenjie Li},
title = {Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection},
booktitle = {The 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea- Volume 1: Long Papers},
pages = {253--262},
year = {2012}}
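
Chinese word segmentation is commonly cast as character-level sequence tagging, which is what lets a CRF or BiLSTM-CRF tagger be applied to it. The sketch below shows one widely used scheme, B/M/E/S (begin, middle, end, single). Whether this repository uses exactly this tag set is an assumption; the helper names `words_to_bmes` and `bmes_to_words` and the sample sentence are hypothetical.

```python
from typing import List


def words_to_bmes(words: List[str]) -> List[str]:
    """Map a segmented sentence to one B/M/E/S tag per character."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        elif len(word) > 1:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def bmes_to_words(chars: str, tags: List[str]) -> List[str]:
    """Rebuild words from characters and their (predicted) tags."""
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):   # a word ends at this character
            words.append(current)
            current = ""
    if current:                 # tolerate a sequence that ends on B or M
        words.append(current)
    return words


sentence = ["患者", "出现", "咳嗽", "。"]   # hypothetical segmented sentence
tags = words_to_bmes(sentence)
restored = bmes_to_words("".join(sentence), tags)
print(tags)
print(restored)
```

Training data is converted with the first helper, and model predictions are turned back into words with the second, so segmentation quality can then be scored at the word level.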

Related Tools and Libraries

CRF

tensorflow/contrib/crf
CRFsuite - A fast implementation of Conditional Random Fields (CRFs)
- chokkan/crfsuite
sklearn-crfsuite
- TeamHG-Memex/sklearn-crfsuite

pku-nlp-forfun/CWS_POS_NER
