DATE | TOPIC | MATERIALS
13/11 | What is a corpus? | Slides [I.2]; [CL], Chapter 1
13/11 | Corpus representativeness and annotations | Slides [I.2]; [CL], Chapters 2 and 3
14/11 | Concordances, collocations and measures of word association | Slides [I.2]; [CL], Chapters 2 and 3
14/11 | Regular Expressions | Slides [I.3]; [SLP], Chapter 2; Reg. Exp. Quick Start; Reg. Exp. Demo
17/11 | Tokenisation and Sentence segmentation | Slides [I.4]; [Schmid, 2008]
17/11 | Text Character Encoding | Slides [I.5]
20/11, 27/11 | Techniques for Text Retrieval Search Engines Indexing Classic Information Retrieval | Slides [I.2], [I.6], [I.7]
21/11 | Corpus Querying with AntConc and Qwick | AntConc website (Local Copy); English Demo Corpus; Qwick instructions
27/11, 28/11 | Techniques for annotating texts with XML. Building and using a small, annotated XML corpus. | Slides [I.8]; XAIRA Documentation (PTB PoS-tags, XAIRA Installer, Demo XML files)
4/12 | Case study: A review of Written and Spoken corpora (English/Italian); Corpora@FICLIT: CORIS/CODIS, BoLC and DiaCORIS | Link1 (English), Link2 (Various languages); Slides [I.9]
- | Laboratory session. |
REFERENCES
[CL]