# Final guide

You can bring a notecard (5in x 7in max) with notes or whatnot.

## Multi-agent systems

What are two principles for designing agent-based simulations?

### Prisoner’s dilemma

True or false? “Defecting” means confessing to the crime so that you get off the hook but your partner goes to jail.

Describe the “Jesus” strategy.

Describe the “Lucifer” strategy.

Describe the “Unforgiving” strategy.

Describe the “Tit-for-tat” a.k.a. “Moses” strategy. What is the first choice in this strategy? What are all future choices?

Which of these strategies generally performed best across random scenarios in the iterated prisoner’s dilemma?

If two “Tit-for-tat” strategies face each other in the iterated prisoner’s dilemma, we will see behavior equivalent to which non-Tit-for-tat strategy or strategies facing each other?

## Learning

Describe the difference between “supervised” and “unsupervised” learning.

What does “10-fold cross validation” mean?

### k-means clustering

Is k-means clustering an unsupervised or a supervised learning strategy?

What does the choice of $$k$$ represent?

How does the choice of initial clusters affect the outcome?

Note: also be able to perform k-means clustering on some data as in Homework 4.
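For practice, here is a minimal k-means sketch written from scratch. The points, initial centers, and iteration count are made up for illustration; they are not the Homework 4 data.

```python
def kmeans(points, centers, iterations=10):
    """Run k-means on 2-D points from the given initial centers."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers, clusters

points = [(1, 1), (1.5, 2), (8, 8), (9, 9)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
```

Note how the result depends on the initial centers: starting both centers near (0, 0) can leave one cluster empty and produce a different final partition.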

### k-nearest neighbor

What does k-nearest neighbor allow us to do with a new, unknown data point?

Is k-nearest neighbor an unsupervised or a supervised learning strategy?

What does the choice of $$k$$ represent?

What problem may a very small value of $$k$$ cause?

What problem may a very large value of $$k$$ cause?

Is there one value for $$k$$ that works best for nearly all data sets? If so, what is it?

Give one benefit of k-nearest neighbor learning.

Give one drawback of k-nearest neighbor learning.

Note: also be able to perform k-nearest neighbor classification on some data as in Homework 4.
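For practice, here is a minimal k-nearest-neighbor classifier. The training points, query, and choice of k are made up for illustration; they are not the Homework 4 data.

```python
from collections import Counter

def knn_classify(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((x, y), label) pairs.
    """
    # Sort training points by squared distance to the query.
    by_dist = sorted(
        train,
        key=lambda item: (item[0][0] - query[0]) ** 2
                         + (item[0][1] - query[1]) ** 2,
    )
    # Majority vote among the k closest labels.
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

train = [((1, 1), "A"), ((2, 1), "A"), ((8, 8), "B"), ((9, 8), "B")]
label = knn_classify(train, query=(1.5, 1.5), k=3)
```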

### Classification evaluation

Define true positive (TP). Define false positive (FP). Define false negative (FN). Define precision (in terms of TP and/or FP and/or FN). Define recall (in terms of TP and/or FP and/or FN). Define F-score (in terms of precision and/or recall).

Suppose we make our classification engine more cautious; that is, it is less likely overall to predict any category. Does precision go up or down or remain unchanged? Does recall go up or down or remain unchanged?

What are the precision and recall for the following scenario:

• The true categories for some docs are: {noise, noise, signal, noise, signal}
• The predicted categories for the docs are (same ordering): {noise, signal, signal, signal, noise}

Consider “signal” to be a “positive” claim.
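As a reference for the definitions, here is a sketch that computes precision, recall, and F-score from TP/FP/FN counts. The toy labels below are made-up illustrations, not the scenario from the question above.

```python
def prf(true, pred, positive):
    """Precision, recall, and F-score for one positive class."""
    # TP: predicted positive and actually positive.
    tp = sum(t == positive and p == positive for t, p in zip(true, pred))
    # FP: predicted positive but actually negative.
    fp = sum(t != positive and p == positive for t, p in zip(true, pred))
    # FN: predicted negative but actually positive.
    fn = sum(t == positive and p != positive for t, p in zip(true, pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

p, r, f = prf(true=["pos", "neg", "pos", "neg"],
              pred=["pos", "pos", "neg", "neg"],
              positive="pos")
```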

### Probability and Bayesian methods

Describe what $$P(a)$$ means (in words).

Describe what $$P(a,b)$$ means (in words).

Describe what $$P(a|b)$$ means (in words).

If events $$a$$ and $$b$$ are independent, and $$P(a) = 0.25$$, $$P(b) = 0.10$$, what is $$P(a,b)$$? What is $$P(b,a)$$?

In the toothache graph from the Bayesian inference notes (the graph with just c, g, and t), what is $$P(t|g)$$?

Write Bayes’ theorem.

Using algebra, derive Bayes’ theorem from the probability calculus equalities.

Suppose I know (or believe) that $$P(b|a)=0.1, P(a)=0.9, P(b)=0.25$$, what is $$P(a|b)$$?

In the toothache graph from the Bayesian inference notes (the graph with just c, g, and t), is $$P(g|t) > P(c|t)$$?

What is the outcome of computing $$\arg\max_x (P(x))$$ where $$x$$ is an event? (If $$\arg\max_x (P(x))$$ were a function, what would its output be?) Describe in English.

Suppose I know (or believe) that $$P(b|a)=0.1, P(a)=0.9, P(b|c)=0.2, P(c)=0.8$$, what is $$\arg\max_x (P(x|b))$$?

### Naïve Bayesian classification

Describe a “binary document vector” for a text document.

Why do we use logarithms for the calculations?

### Neural networks

What does “all or nothing” mean when we talk about neurons in the brain?

Explain the Hebbian learning rule.

What is happening when an artificial neural network is “learning”?

Generate the input/output table for this perceptron:

Draw perceptron and define inputs for the NAND function.

Define the word “epoch.”

Write the perceptron learning rule for a single weight. Define the variables you use.

Explain why the perceptron learning rule has $$d_j-y_j$$ and not $$y_j-d_j$$ (everything else being equal).

Give the “loss” function for logistic perceptrons and define the variables you use.

Give the logistic perceptron learning rule, with activation function $$1/(1+e^{-s_j})$$, where $$s_j$$ is the weighted sum of a perceptron’s inputs.

True or false: a single-layer perceptron network can compute any function.

Describe some differences, in terms of processing power and technique, between a human brain and a typical personal computer.

Note: be able to apply the perceptron learning rule (with the threshold function) to a single perceptron with a small number of weights.
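For practice, here is a sketch of the perceptron learning rule with a threshold activation, applied to learn the AND function. The learning rate, initial weights, and epoch count are made-up illustrations.

```python
def step(s):
    """Threshold activation: fire (1) when the weighted sum is >= 0."""
    return 1 if s >= 0 else 0

def train_epoch(weights, bias, samples, rate=0.5):
    """One epoch: update each weight by rate * (d_j - y_j) * input."""
    for inputs, desired in samples:
        s = sum(w * x for w, x in zip(weights, inputs)) + bias
        actual = step(s)
        error = desired - actual           # d_j - y_j, not y_j - d_j
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
        bias = bias + rate * error         # bias treated as a weight on input 1
    return weights, bias

# Learn the AND function.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = [0.0, 0.0], 0.0
for _ in range(10):                        # 10 epochs
    weights, bias = train_epoch(weights, bias, samples)
```

After training, the perceptron classifies all four AND inputs correctly; note how the sign of `error` pushes each weight toward the desired output.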

## Computer vision

Note: be able to apply a linear convolution kernel to a 3x3 pixel gray-scale image.
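For practice, here is a sketch of applying a 3x3 linear convolution kernel to a 3x3 gray-scale image. The pixel values and kernel are made up; with no padding, a 3x3 kernel over a 3x3 image produces a single output value. (With a symmetric kernel like this one, correlation and convolution give the same result, so no kernel flip is needed.)

```python
image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]

# A simple averaging (box blur) kernel.
kernel = [[1/9, 1/9, 1/9],
          [1/9, 1/9, 1/9],
          [1/9, 1/9, 1/9]]

# Elementwise multiply and sum over the 3x3 window.
result = sum(image[r][c] * kernel[r][c]
             for r in range(3) for c in range(3))
```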

## Philosophy

Describe the Church-Turing thesis in a few sentences.

### The “Chinese room” argument

What is the essential goal of “strong AI”?

What is the most critical assumption in the Chinese room argument?

If you believe the Chinese room argument, can you also (reasonably) believe that passing the Turing test gives proof that a machine possesses a mind (i.e., can be said to truly understand things)?

### The “Norvig-Chomsky” debate

Give a 2-3 sentence summary of the debate.

### Robot ethics

Give two ethical issues related to the “take your medicine robot.”

## Extra credit

What is bigger than an ant?