DATE | TOPIC | MATERIALS |
12/11 | What is a corpus? | - Slides [I.2] - [CL] - Chapter 1 |
12/11, 15/11 | Corpus representativeness and annotations | - Slides [I.2] - [CL] - Chapters 2 and 3 |
18/11 | Concordances, collocations and measures of word association | - Slides [I.2] - [CL] - Chapters 2 and 3 |
18/11 | Regular Expressions | - Slides [I.3] - [SLP] - Chapter 2 - Reg. Exp. Quick Start - Reg. Exp. Demo |
19/11 | Corpus typology and design | - [Atkins et al. 1992] |
19/11 | Tokenisation and sentence segmentation | - Slides [I.4] - [Schmid, 2008] |
- | Text character encoding | - Slides [I.5] |
- | Techniques for text retrieval. Classic information retrieval. | - Slides [I.2], [I.6], [I.7] |
- | Corpus querying with AntConc and Qwick | - AntConc website (Local Copy) - English Demo Corpus - Qwick instructions |
- | Techniques for annotating texts with XML. Building and using a small, annotated XML corpus. | - Slides [I.8] - XAIRA Documentation (PTB PoS-tags, XAIRA Installer, Demo XML files) |
- | Case study: - A review of written and spoken corpora (English/Italian) - Corpora@FICLIT: CORIS/CODIS, BoLC and DiaCORIS | - Link1 (English), Link2 (Various languages) - Slides [I.9] |
- | Laboratory session | |
REFERENCES
[CL]