DATE | TOPIC | MATERIALS
14/11 | What is a corpus? | Slides [I.2]; [CL], Chapter 1
14/11 | Corpus representativeness and annotations | Slides [I.2]; [CL], Chapters 2 and 3
14/11 | Concordances, collocations and measures of word association | Slides [I.2]; [CL], Chapters 2 and 3
15/11 | Regular Expressions | Slides [I.4]; [SLP], Chapter 2; Reg. Exp. Quick Start; Reg. Exp. Demo
15/11 | Tokenisation and Sentence Segmentation | Slides [I.3]; [Schmid, 2008]
18/11 | Text Character Encoding | Slides [I.3b]
18/11, 21/11 | Techniques for Text Retrieval: Search Engines, Indexing | Slides [I.2], [I.2b]
21/11 | Corpus Querying with AntConc and Qwick | AntConc website (Local Copy); Qwick instructions; English Demo Corpus
22/11, 25/11 | Techniques for annotating texts with XML. Building and using a small, annotated XML corpus. | Slides [I.5]; XAIRA Documentation (PTB PoS-tags, XAIRA Installer, Demo XML files)
28/11 | Case study: a review of written and spoken corpora (English/Italian); Corpora@FICLIT: CORIS/CODIS, BoLC and DiaCORIS | Link1 (English), Link2 (Various languages); Slides [I.6]
- | Laboratory session |
REFERENCES
[CL]