Unrecoverable EdTech Error

Val Tenyotkin
Published in Kiddom Engineering
7 min read · Jan 29, 2018

--

I have worked at three education technology companies and am familiar with the inner workings of a few more. By some virtue or vice, the same mistakes surface in many early-stage EdTech startups. These lapses of judgement are eerily similar across the industry and become the oft-discussed tech debt, which takes years to retire and slows feature development, scaling, and hence growth. Outlined here are pitfalls, warnings, and tested ideas for companies entering the education space. These issues are specific to EdTech; I am not going to lecture you on micro-services, CI/CD, or back end/front end segregation.

“The worst decisions are the ones you do not make.” — Unknown

Lesson I: All Users Are Created Equal

Database schemas of early-stage EdTech startups tend to use separate data structures (tables in a *SQL database) for, say, teachers and students. That is a mistake. Everything from signup and login to analytics and administration grows in complexity by a factor comparable to the number of distinct user data structures.

How many lines of code must be added or changed to create a new type of user and give them access to several resources?

On signup, for instance, the table of every user type must be consulted to establish email/username uniqueness. If a teacher creates an assignment, the students and the TA are notified via email. The emails for the students and the TAs are not stored in the same table, so instead of one simple query, two are likely required. Imagine now that a user type ‘Parent’ is introduced. Every query must be modified to deal with this user type and JOINed on the parent table. Say a user leaves a comment: this newly created object must be tagged with both the user ID and the user type to attribute it uniquely. Multiply these simple examples by several dozen endpoints and you’ve got a sluggish codebase that is painful to iterate upon.

To save many engineer-months, consider a single users table:

| User ID | Name | Email   | ... | Role    |
+---------+------+---------+-----+---------+
| 1       | Bob  | b@c.com |     | Teacher |
| 2       | Judy | j@p.org |     | Student |
| 3       | Lee  | l@c.net |     | Teacher |

The Role determines users’ access to various application resources. Having a unique user ID also simplifies object ownership verification and storage sans additional columns:

protocol://your.storage.solution/photos/{f(user_id)}.jpg
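As a minimal sketch of the single-table idea, the snippet below uses Python's built-in sqlite3 module purely for illustration. The column names, the list of allowed roles, and the helpers make_users_db and sign_up are assumptions, not the schema of any particular product. The point is that one UNIQUE constraint covers every user type, so adding 'Parent' touches no signup code.

```python
import sqlite3


def make_users_db():
    """Build an in-memory users table seeded with the rows from the table above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE users (
            id    INTEGER PRIMARY KEY,
            name  TEXT NOT NULL,
            email TEXT NOT NULL UNIQUE,   -- one uniqueness check for all roles
            role  TEXT NOT NULL
                  CHECK (role IN ('Teacher', 'Student', 'Parent'))  -- assumed role list
        )
        """
    )
    conn.executemany(
        "INSERT INTO users (id, name, email, role) VALUES (?, ?, ?, ?)",
        [
            (1, "Bob", "b@c.com", "Teacher"),
            (2, "Judy", "j@p.org", "Student"),
            (3, "Lee", "l@c.net", "Teacher"),
        ],
    )
    return conn


def sign_up(conn, name, email, role):
    """Return the new user ID, or None if a constraint rejects the row
    (email already taken, or role not in the allowed list)."""
    try:
        cur = conn.execute(
            "INSERT INTO users (name, email, role) VALUES (?, ?, ?)",
            (name, email, role),
        )
    except sqlite3.IntegrityError:
        return None
    return cur.lastrowid


conn = make_users_db()
new_id = sign_up(conn, "Pat", "pat@example.org", "Parent")   # hypothetical new user
duplicate = sign_up(conn, "Imposter", "b@c.com", "Student")  # Bob's email, rejected
```

With separate tables per user type, the same check would need one lookup per table, and a new table for every new role.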

Lesson II: Nothing Is Owned By The Teacher

If a teacher leaves a school, the class (s)he taught doesn’t disappear and the lock on the door isn’t permanently sealed with black magic. The comments (s)he left on students’ work don’t evaporate. While this is common sense, engineers, PMs, and designers rarely if ever account for it in their initial work. A teacher object does not own a class, a roster, a student object, a grade book, or the comments (s)he leaves on students’ assignments. A teacher is merely associated with a class, whereas classes (and other similar entities) are master-less objects existing on their own, just like in real life.

A teacher is sick, the substitute can’t access the syllabus.

Consider a users_classes table, which contains nothing except the association of a user and a class:

| User ID | Class ID | Class Role |
+---------+----------+------------+
| 1       | 56       | Teacher    |
| 2       | 301      | TA         |
| 3       | 207      | Teacher    |
| 103     | 301      | Student    |

If one, for instance, desires to fire off a notification to the teachers and TAs of a given class, this query returns their emails:

SELECT users.email
FROM users
INNER JOIN users_classes
    ON users_classes.user_id = users.id
WHERE
    users_classes.class_role IN ('Teacher', 'TA')
    AND users_classes.class_id = 301;
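To see the query run end to end, here is a small sketch against an in-memory SQLite database. The table definitions, the helper name staff_emails, and the user 'Sam' (ID 103, who appears in users_classes but not in the users example) are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY, name TEXT, email TEXT UNIQUE, role TEXT
    );
    CREATE TABLE users_classes (
        user_id    INTEGER NOT NULL REFERENCES users(id),
        class_id   INTEGER NOT NULL,
        class_role TEXT    NOT NULL,
        PRIMARY KEY (user_id, class_id, class_role)
    );
    INSERT INTO users VALUES
        (1,   'Bob',  'b@c.com',       'Teacher'),
        (2,   'Judy', 'j@p.org',       'Student'),
        (3,   'Lee',  'l@c.net',       'Teacher'),
        (103, 'Sam',  's@example.org', 'Student');  -- hypothetical row
    INSERT INTO users_classes VALUES
        (1, 56, 'Teacher'), (2, 301, 'TA'),
        (3, 207, 'Teacher'), (103, 301, 'Student');
    """
)


def staff_emails(conn, class_id, roles=("Teacher", "TA")):
    """Emails of everyone holding one of `roles` in the given class."""
    placeholders = ", ".join("?" for _ in roles)  # only '?' marks are interpolated
    rows = conn.execute(
        f"""
        SELECT users.email
        FROM users
        INNER JOIN users_classes ON users_classes.user_id = users.id
        WHERE users_classes.class_role IN ({placeholders})
          AND users_classes.class_id = ?
        ORDER BY users.email
        """,
        (*roles, class_id),
    )
    return [email for (email,) in rows]


staff = staff_emails(conn, 301)
students = staff_emails(conn, 301, roles=("Student",))
```

The same single query serves any combination of class roles; only the parameter list changes.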

Note. Class role is distinct from application role, which allows a student to be a TA, a common practice. (Depending on the focus of the application, a class role is not always necessary.) Access to resources can be granted based on the combination of application role and class role in a class or any other entity:

SELECT class_role
FROM users_classes
WHERE
    class_id = 301
    AND user_id = 2;

If the result is empty, this user is not authorized to access the class. Otherwise, decide whether any of the returned roles has access to a given resource: a teacher has access to the grade book, for example, while a TA and a Student only have access to the assignment calendar. One can generalize this association structure to:

| User ID | Object ID | Object     | Role    |
+---------+-----------+------------+---------+
| 1       | 56        | Class      | Teacher |
| 2       | 301       | Curriculum | Author  |
| 3       | 207       | Comment    | Author  |
| 103     | 301       | Class      | TA      |
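One way the generalized table could drive authorization is sketched below. The table name user_objects, the PERMISSIONS map, the resource names, and the helper can_access are all hypothetical. The map also assumes a teacher can see the assignment calendar in addition to the grade book.

```python
import sqlite3

# (object type, role) -> resources that role may access. Assumed policy.
PERMISSIONS = {
    ("Class", "Teacher"): {"grade_book", "assignment_calendar"},
    ("Class", "TA"): {"assignment_calendar"},
    ("Class", "Student"): {"assignment_calendar"},
}

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_objects (
        user_id     INTEGER NOT NULL,
        object_id   INTEGER NOT NULL,
        object_type TEXT    NOT NULL,
        role        TEXT    NOT NULL
    );
    INSERT INTO user_objects VALUES
        (1,   56,  'Class',      'Teacher'),
        (2,   301, 'Curriculum', 'Author'),
        (3,   207, 'Comment',    'Author'),
        (103, 301, 'Class',      'TA');
    """
)


def can_access(conn, user_id, object_type, object_id, resource):
    """True if any role the user holds on this object grants the resource.
    No association row means no access at all."""
    rows = conn.execute(
        """
        SELECT role FROM user_objects
        WHERE user_id = ? AND object_type = ? AND object_id = ?
        """,
        (user_id, object_type, object_id),
    ).fetchall()
    return any(
        resource in PERMISSIONS.get((object_type, role), set())
        for (role,) in rows
    )


teacher_sees_grades = can_access(conn, 1, "Class", 56, "grade_book")
ta_sees_grades = can_access(conn, 103, "Class", 301, "grade_book")
ta_sees_calendar = can_access(conn, 103, "Class", 301, "assignment_calendar")
```

Because the object type is part of the lookup, user 2 (an Author of Curriculum 301) gains nothing on Class 301 even though the numeric IDs match.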

This brings us to the next point…

Lesson III: A Class Can Have More Than One Teacher

“How do I give another teacher access to my class?”, a teacher asked;

“Share your login credentials with them,” customer support answered.

This is a notorious piece of tech debt every startup spends months paying off because original schemas only allow one teacher per class, which does not bear out in reality because co-teaching is a common practice. The users_classes structure above allows multiple teachers/TAs/co-teachers per class. Thus a corollary…

Lesson IV: Everything Is An Array/List

For as long as I’ve been in the industry, countless engineer-weeks have been spent on this exact problem:

“We thought there can only be one X in Y, but it turns out there can be multiple X in Y” — Every EdTech PM.

Some schools have multiple principals, multiple campuses, multiple grades for the same assignment, multi-departmental (a.k.a. cross-listed) classes, etc. If something can be practicably and cheaply made into an array, do so from the start and avoid the bug-a-palooza of modifying the entire schema and code to allow for multiple memes to be attached to an assignment. The front end can ignore every element except item[0], but the back end and the database schema must allow multiples of everything.
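A small sketch of what "list from the start" can look like in application code. The Assignment class, its field names, and primary_grade are hypothetical. The back end stores lists, while a front end that shows a single value today simply reads item[0].

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Assignment:
    title: str
    # Lists even though the UI may only ever show one of each today.
    grades: List[float] = field(default_factory=list)
    attachment_urls: List[str] = field(default_factory=list)


def primary_grade(assignment: Assignment) -> Optional[float]:
    """What a single-grade front end would display: item[0], if present."""
    return assignment.grades[0] if assignment.grades else None


essay = Assignment("Essay", grades=[88.0, 93.5])
quiz = Assignment("Quiz")

essay_grade = primary_grade(essay)
quiz_grade = primary_grade(quiz)
```

When the second grade becomes a product requirement, only the display layer changes; the stored shape already allows it.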

Lesson V: Students Will Not Remember Their Passwords

The younger they are, the more likely this will be the case. When I taught middle school, I had to quickly give up on the idea of sending my students email because addresses change several times per semester. It goes like this:

  1. Student signs up for an account; bangs on the keyboard when prompted for password.
  2. Loses/Damages/Upgrades phone, gets a new one.
  3. Tries to log in; has forgotten the password.
  4. Creates a new account.
  5. Phone is taken away for misbehavior.
  6. Can’t log in because password is stored on the phone.
  7. Gets a new account.
  8. etc.
  • Nope. Biometrics? No lawyer will allow any semblance of a minor’s fingerprint, face, retinal scan, or voice to be stored anywhere, ever. No matter how irreversibly hashed/processed it is. Apple allegedly tried it and dropped the effort due to legal concerns.
  • Nope. PIN based on a student ID or some other easily ascertainable number? Well, it’s easily ascertainable, which means that another individual can ascertain it just as easily.
  • Nope. QR code? Kids lose everything, all of the time, guaranteed. Students are also skilled at taking photos and will have little difficulty duplicating another student’s QR badge.
  • Yes. If your EdTech product involves a student login, the feature to build immediately after student login is the student password/PIN reset by the teacher. Not a convoluted email-confirmation-change-token thing: click a button, enter the student’s new password/PIN, press ENTER. That’s it. A teacher will not spend more than a minute resetting a student’s password.
  • Yes. The second feature to build is force-login: teacher walks up to the student’s computer/tablet/phone, flashes a master QR code or enters master password, selects a student from a list, student is logged in.
  • Yes. The third feature to build is mobile sign in via a [variable] QR code from a web portal, the reverse of what WhatsApp does. This isn’t for students only; nobody likes typing passwords on their mobile devices.
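The teacher-driven reset from the list above can be sketched as follows. Everything here is an assumption for illustration: the in-memory dictionaries standing in for database tables, the student with ID 104, the one-role-per-user-per-class simplification, and the choice of PBKDF2 from the standard library as the password hash.

```python
import hashlib
import hmac
import os

# (user_id, class_id) -> class role. Hypothetical stand-in for users_classes.
USERS_CLASSES = {(1, 56): "Teacher", (104, 56): "Student", (2, 301): "TA"}
PASSWORD_HASHES = {}  # user_id -> (salt, derived key)


def _derive(password, salt):
    """Salted PBKDF2-HMAC-SHA256; the iteration count is an assumed setting."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)


def set_password(user_id, new_password):
    salt = os.urandom(16)
    PASSWORD_HASHES[user_id] = (salt, _derive(new_password, salt))


def check_password(user_id, password):
    record = PASSWORD_HASHES.get(user_id)
    if record is None:
        return False
    salt, expected = record
    return hmac.compare_digest(expected, _derive(password, salt))


def teacher_reset_password(teacher_id, student_id, class_id, new_password):
    """Reset succeeds only if the caller teaches the class and the target
    is a student in that same class. No email, no token."""
    if USERS_CLASSES.get((teacher_id, class_id)) != "Teacher":
        return False
    if USERS_CLASSES.get((student_id, class_id)) != "Student":
        return False
    set_password(student_id, new_password)
    return True


reset_ok = teacher_reset_password(1, 104, 56, "blue-otter-42")
outsider_ok = teacher_reset_password(2, 104, 56, "gotcha")  # TA of another class
```

The authorization check reuses the class association, so a co-teacher added to the class gets the reset button without any extra code.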

You’re balancing security against time. A perfect vault is easy to build because it has no doors. That said…

Lesson VI: Don’t get inBloomed

Unlike literally every other industry, even a perception of a sliver of a hint of a contemplative notion of a data breach can bring down EdTech companies. To that end, security cannot be a mere afterthought of a project; it has to be tightly coupled to it. “Well,” one says, “we don’t sell student information, we have awesome passwords, and we do everything Kiddom does. We’ll be fine!”

A typical app broadcasts information to a myriad of services: Intercom, Segment, Keen, etc. Does the front end of your application withhold names and email addresses of students from tracking services? Do your retargeting trackers turn off for minors? Are tracking pixels attached to emails sent to students? Does Periscope know that Judy Smith got a 98 on her final exam in Mr. Tompkin’s math class? Do Loggly and Rollbar know Judy’s near real-time location from her IP? And most importantly: do any of these companies know what FERPA and COPPA stand for?

Bonus points if you don’t remove GPS information from uploaded images. Double bonus points if your customer support, data backup, log aggregator, or engineering contractors are located in a non-extradition jurisdiction.

Allowing student PII (Personally Identifiable Information) to flow to third parties is one of the most common non-decisions made by young companies. These lapses of judgement will bury you. Tread lightly.
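One way to turn that non-decision into a decision is an allowlist applied to every event before it leaves your servers. The sketch below is illustrative only: the field names, ALLOWED_FIELDS, and scrub_event are hypothetical, and which fields are actually safe to share is a legal question this code does not answer.

```python
# Only these keys may ever reach a third-party service. Assumed list.
ALLOWED_FIELDS = frozenset({"event", "app_role", "timestamp"})


def scrub_event(event):
    """Return a copy of the event containing only allowlisted keys.
    Anything not explicitly approved (names, emails, scores, IPs) is dropped."""
    return {key: value for key, value in event.items() if key in ALLOWED_FIELDS}


raw_event = {
    "event": "exam_graded",
    "app_role": "Student",
    "timestamp": 1517184000,
    "name": "Judy Smith",
    "email": "j@p.org",
    "score": 98,
    "ip": "203.0.113.7",
}

outbound = scrub_event(raw_event)
```

An allowlist is used rather than a denylist so that a field added next quarter is withheld by default until someone deliberately approves it.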

[K-12] Education exists in legal and economic twilight zones, where the victims of your screw-ups are not your customers (children). Your customers aren’t the ones paying you (teachers). And those who pay you (schools and districts) will ultimately be held accountable while everyone and their superintendent is distancing themselves from you faster than a SpaceX rocket. Insolvency and illiquidity await.
