Syllabus
- Part I: Techniques for corpus creation and management
- Corpora and their construction: representativeness.
- Annotations and querying. Web as a corpus.
- Concordances, collocations and measures of word association.
- Tokenisation and sentence splitting.
- Methods for Text Retrieval.
- Regular Expressions.
- XML corpora.
- Corpus querying packages.
- Case studies:
- Written and spoken corpora (Italian/English): a review.
- Corpora@FICLIT: CORIS/CODIS, BoLC and DiaCORIS.