| DATE | TOPIC | MATERIALS |
|---|---|---|
| 11/11 | What is a corpus? | Slides [I.2]; [CL], Chapter 1 |
| 11/11 | Corpus representativeness and annotations | Slides [I.2]; [CL], Chapters 2 and 3 |
| 14/11 | Concordances, collocations and measures of word association | Slides [I.2]; [CL], Chapters 2 and 3 |
| 14/11 | Regular Expressions | Slides [I.3]; [SLP], Chapter 2; Reg. Exp. Quick Start; Reg. Exp. Demo |
| 17/11 | Corpus typology and design | [Atkins et al. 1992] |
| 18/11 | Tokenisation and Sentence Segmentation | Slides [I.4]; [Schmid, 2008] |
| 18/11 | Text Character Encoding | Slides [I.5] |
| 21/11 | Techniques for Text Retrieval. Classic Information Retrieval. | Slides [I.2], [I.6], [I.7] |
| 24/11 | Corpus Querying with AntConc and Qwick | AntConc website (Local Copy); English/Arabic Demo Corpora; Qwick instructions |
| 25/11, 28/11 | Techniques for annotating texts with XML. Building and using a small, annotated XML corpus. | Slides [I.8]; XAIRA Documentation (PTB PoS-tags, XAIRA Installer, Demo XML files) |
| 1/12 | Case study: A review of Written and Spoken corpora (English/Italian); Corpora@FICLIT: CORIS/CODIS, BoLC and DiaCORIS | Link1 (English), Link2 (Various languages); Slides [I.9] |

REFERENCES
[CL]