IST 384 – Natural Language Processing

 

Professor: Gondy Leroy

Classroom & Time: Wednesday, 4:00 – 6:50 pm, ACB 222

Office Hours: By appointment 

 

Note: On April 2 and May 7, we meet in Harper 61.

 

Objectives of the course

Most advanced information and knowledge management systems today need to provide access to text in some way. These systems range from search engines that map user keywords to documents, to automated text summarization systems and text mining toolkits.

In this class, we will cover the linguistics topics that you need to understand to work on projects and systems that include text. We will apply this knowledge to information systems and technology.

You will focus on one particular problem for a specific domain. Each week, you will apply what you learned in class to your problem and so work towards a solution. Upon successful completion of this class, you will understand enough linguistics to explain how existing systems work and to suggest improvements. You will also have worked with open-source code and applied both knowledge and tools to a current, cutting-edge problem in IS. Finally, after completing this class, you will be in a position to tackle many more problems and opportunities related to NLP for IS.

 

Textbook

Required:

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition

by Daniel Jurafsky and James H. Martin

Publisher: Prentice Hall; 1st edition (January 26, 2000)

ISBN-10: 0130950696

ISBN-13: 978-0130950697

Additional handouts will be provided as we progress through the semester.

Other good resources:

Foundations of Statistical Natural Language Processing, Christopher D. Manning and Hinrich Schütze, MIT Press.

Statistical Language Learning, Eugene Charniak, MIT Press.

 

Prerequisites

Students must have basic programming skills. You are encouraged to use open-source software and to help each other solve problems. You should have access to a computer for development and project demonstrations.

 

Grading Policy

Programming assignments/exercises/participation in class    80%
- Choose a Project  5%
- Corpus Development  15%
- Related Work  10%
- Evaluation Plan  10%
- Project Presentation  10%
- Final Project  20%
- Class Discussions  10%
Comprehensive Exam    20%

(90/100 = A, 80/100 = B, 70/100 = C, below 70 = U)
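
As an illustration only, the weights and cutoffs above could be combined as in the following sketch. It assumes that each component is scored as a fraction between 0.0 and 1.0, that a missing component counts as zero, and that each cutoff is an inclusive lower bound (for example, 90 and above is an A). None of these assumptions is stated in the policy itself, and the function names are hypothetical.

```python
# Weights copied from the Grading Policy (percent of the final grade).
WEIGHTS = {
    "Choose a Project": 5,
    "Corpus Development": 15,
    "Related Work": 10,
    "Evaluation Plan": 10,
    "Project Presentation": 10,
    "Final Project": 20,
    "Class Discussions": 10,
    "Comprehensive Exam": 20,
}


def final_score(scores):
    """Return the weighted total out of 100.

    `scores` maps a component name to the fraction earned (0.0 to 1.0).
    Assumption: a component missing from `scores` counts as zero.
    """
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())


def letter_grade(total):
    """Map a total out of 100 to a letter grade.

    Assumption: each cutoff is an inclusive lower bound.
    """
    if total >= 90:
        return "A"
    if total >= 80:
        return "B"
    if total >= 70:
        return "C"
    return "U"


if __name__ == "__main__":
    # Hypothetical student: full marks on the project parts, 75% on the exam.
    example = {name: 1.0 for name in WEIGHTS}
    example["Comprehensive Exam"] = 0.75
    total = final_score(example)
    print(total, letter_grade(total))
```

The sketch does not model the participation penalty described under Assignments and Discussions.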

 

Lectures and Academic Integrity

You are required to attend all lectures, including presentations. It is your responsibility to obtain material from a fellow student if you miss a lecture. Office hours are not meant as individual lectures.

Plagiarism, cheating or any type of dishonesty will result in a failing grade for this class and will be reported to the University.

 

Assignments and Discussions

During the semester you will work on one major project, which will be chosen during the first few weeks. You will be required to complete several smaller, individual assignments for this project. Each assignment will build on the previous one, so it is imperative that you keep up.

During classes, we will also discuss each part of the project. You will be required to be present and prepared for the discussions. This counts towards your final grade. Students who do not take an active part in discussions may lose up to 1% of their grade per class.

All details will be handed out in class and will be made available online.

 

Tentative Course Outline 

Each entry lists the date, the topic, and the assignments (dates are subject to change).

Jan 23
Topic: Introduction

Jan 30
Topic: Words
Class Discussion: Potential Projects
Tasks:
- Read Chapters 2, 3, and 6 in Jurafsky

Feb 13
Topic: Parts-of-Speech, Syntax; GATE (Tools)
Assignment Due – Project Choice
Tasks:
- Read Chapter 8
- Install GATE

Feb 27
Topic: Context Free Grammars; Ontologies
Discussion: Assignment 2 (Corpus)
Tasks:
- Read Chapters 9 and 10 in Jurafsky

Mar 12
Topic: Evaluation
Discussion: Evaluation Plan
Work: Apply Evaluation to Project
Assignment Due – Corpus Development

Mar 19
SPRING BREAK – No Class

Apr 2
Topic: Related Work Presentations
Class Work: Apply Semantics to Projects
Assignment Due – Related Work
We meet in room: Harper 61

Apr 16
COMPREHENSIVE EXAM

Apr 30
Topic: Semantics
Assignment Due – Evaluation Plan
Tasks:
- Read Chapter 14

May 7
Topic: Project Presentations
Assignment Due – Project Presentations
Assignment Due – Final Project
We meet in room: Harper 61

May 14
EXAM WEEK – No Class