Homework 4 notes
For Homework 4, Problem 1 was randomly selected for instructor grading. This notebook contains my notes on general trends in the class's code and on interesting questions raised in your reflections.
The topics covered in the notes include:
- Good use of AI-generated code
- Right answers for wrong reasons
Good use of AI-generated code
I saw a lot of good examples of people using AI-generated code! There were two main flavors of answers from AI-generated code for Problem 1: concise or verbose. Concise versions generally looked something like this:
import numpy as np

def get_pendulum_data(data_file):
    with open(data_file, 'r') as f:
        first_line = f.readline().strip()
    dt = float(first_line.split(':')[1].split()[0])
    data = np.loadtxt(data_file, delimiter=',', skiprows=3)
    x = data[:,0]
    y = data[:,1]
    return dt, x, y

This code is easy to read and doesn't do anything too fancy.
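To make the concise version's assumptions concrete, here's a sketch of the header layout it expects. The sample contents below (the `dt:` label, the three header lines, the column names) are my guesses reconstructed from the code, not the actual assignment file:

```python
import io
import numpy as np

# Hypothetical file contents matching what the concise reader assumes:
# a "dt: <value> <unit>" first line, three header lines total, then
# comma-separated x,y data. This layout is illustrative, not the real file.
sample = """dt: 0.01 s
columns: x, y
units: m, m
0.00,1.00
0.01,0.98
0.02,0.95
"""

# Same parsing steps as the concise function, applied to the sample text.
first_line = sample.splitlines()[0]
dt = float(first_line.split(':')[1].split()[0])
data = np.loadtxt(io.StringIO(sample), delimiter=',', skiprows=3)
x, y = data[:, 0], data[:, 1]
```

If the real file puts `dt` on a different line or uses a different number of header lines, this concise approach breaks immediately, which is exactly the trade-off discussed below.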
More verbose AI-generated code solutions looked like this:
import io
import re
import numpy as np

def get_pendulum_data(data_file):
    """Read pendulum CSV file and return (dt, x, y)."""
    with open(data_file, 'r') as f:
        lines = f.read().splitlines()
    dt = None
    for ln in lines[:5]:
        s = ln.strip()
        if not s:
            continue
        m = re.search(r'[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?', s)
        if m:
            try:
                dt = float(m.group(0))
                break
            except ValueError:
                pass
    if dt is None:
        raise ValueError('Could not parse dt from file header')
    data_start = None
    for i, ln in enumerate(lines):
        s = ln.strip()
        if not s:
            continue
        parts = [p.strip() for p in s.split(',') if p.strip() != '']
        if len(parts) >= 2:
            try:
                _ = float(re.search(r'[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?', parts[0]).group(0))
                _ = float(re.search(r'[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?', parts[1]).group(0))
                data_start = i
                break
            except Exception:
                continue
    if data_start is None:
        raise ValueError('Could not find numeric data in file')
    data_text = '\n'.join(lines[data_start:])
    xy_data = np.loadtxt(io.StringIO(data_text), delimiter=',')
    x = xy_data[:, 0]
    y = xy_data[:, 1]
    return dt, x, y

There's nothing wrong with the verbose version, but it's certainly harder to read. Most of what it's doing is inspecting the format of the CSV file to figure out what's in there. A lot of students noted that their simple solutions to extracting the data limit what kinds of file structures they can handle. That's a great point! The cost of trying to generalize is much longer code.
Even though I don't prefer the verbose code, there were some excellent submissions that used it. For example, I saw some students who had verbose code but explained in their reflections how they checked that what the function returned was what they wanted. That's a great approach!
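If you want a concrete version of that kind of check, here's a minimal sketch. These particular checks (and the example values) are my own suggestions, not taken from any submission:

```python
import numpy as np

def sanity_check(dt, x, y):
    """A few cheap checks on whatever get_pendulum_data returned."""
    assert dt > 0, "time step should be positive"
    assert len(x) == len(y), "x and y should have the same length"
    assert np.all(np.isfinite(x)) and np.all(np.isfinite(y)), "no NaNs or infs"

# Made-up example values just to exercise the checks:
sanity_check(0.01, np.array([0.0, 0.1, 0.2]), np.array([1.0, 0.9, 0.8]))
```

A few lines like this won't prove the function is correct, but they catch the most common failure mode: the parser silently grabbing the wrong line or column.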
An even better approach that some took is to get Copilot to do better. By breaking the problem down and asking Copilot to complete smaller tasks, some students were able to get something much more readable than the verbose code.
I’d be interested to know what people used for their prompts!
Right answers for the wrong reasons
I think everyone who had an answer for the bonus problem came to the right conclusion (the data were recorded in different units) but for the wrong reasons. The approaches all had some merit, but made assumptions that would have concluded that the data were in different units even if they weren’t. Setting the bonus problem itself aside for a moment, the idea of getting the right answer for the wrong reasons is worth thinking about.
The danger with getting the right answer for the wrong reasons is that it gives a false sense of confidence in our solution. This may lead us to re-use the solution in another situation where the wrong reasoning (that we’re unaware of) gives us a wrong answer that we assume is right.
The only way to truly avoid this is to conduct rigorous testing that challenges all of our assumptions and shows their limitations. We won’t get into formal software testing in this course. And it’s often not practical to challenge every single assumption we make, even if we know how to do that.
So how do we balance the need to check our assumptions against the impracticality of writing exhaustive tests for every problem we solve? It will probably not be a surprise that I think asking Copilot provides a good balance here. Copilot won't catch every mistake in your assumptions, but it does provide the "second set of eyes" that is often critical to spotting the errors we're blind to.
In my work, I use prompts like this:
- Check the assumptions in my approach
- Is my approach likely to lead to the right solution?
- Will my approach lead to false positives? False negatives?
I suggest that you try these kinds of prompts to see what you get. Try asking the question a few different ways, too. As a point of reference, I asked Copilot similar questions about several of the right-answer-wrong-reason submissions for the Bonus problem and it was able to identify the incorrect assumptions and modify them to better represent the problem.