Alright, so you feel comfortable with relational database theory. Now you actually need to set up a database. What will that look like? In this tutorial, we’ll walk through setting up a PostgreSQL database on your computer, why it’s useful, and how to populate it with data.

What’s a local database?

A local database is one hosted on the same computer that is trying to access it. For security reasons, local databases are not, and should not be made, publicly accessible. However, they’re very useful for people who are trying to work with more…


This is part 2 in a series of write-ups based on common questions I get in TA hours for Visual Analytics at Smith College. Feel free to reach out to me and request future topics! This reading will be helpful for people who are trying to analyze written text. Note: code examples in this tutorial are written in Python, but similar tools are available in R.

File Encodings

The first and most important step of analyzing a text is reading it into memory. No one could disagree with that one. However, reading text is a bit…
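To make the point about reading text concrete, here is a minimal sketch in Python of what can go wrong when a file is read with the wrong encoding. The sample string, the temporary file, and the helper name `read_text` are illustrative assumptions, not taken from the tutorial itself.

```python
import os
import tempfile

# Illustrative sample text containing one non-ASCII character.
text = "café"

# Save the text as Latin-1 bytes in a temporary file.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as handle:
    handle.write(text.encode("latin-1"))
    path = handle.name


def read_text(file_path, encoding):
    """Read a whole file into memory using an explicit encoding."""
    with open(file_path, encoding=encoding) as f:
        return f.read()


# Reading Latin-1 bytes as UTF-8 fails, because the byte for "é"
# is not a valid UTF-8 sequence on its own.
try:
    read_text(path, "utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Reading with the encoding the file was written in recovers the text.
recovered = read_text(path, "latin-1")

os.remove(path)
print(utf8_failed, recovered)
```

Passing `encoding=` explicitly, rather than relying on the platform default, is one way to keep the same script behaving the same way on different machines.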


This is part 1 in a series of write-ups based on common questions I get in TA hours for Visual Analytics at Smith College. Feel free to reach out to me and request future topics! This reading will be helpful for people who are trying to structure their findings when doing data analysis.

How can you model data? (A.K.A. the quickest introduction to database theory possible)

Let’s say that you have millions of data points and you want to find some way to store them. How are you going to do it?

If there are common attributes about all of…


Accessing a remote machine might seem like a complicated process, but if you go step by step, you can do it easily. This tutorial is aimed at students who have taken or are in the midst of taking a class on Python and was written for students at Boise State University. If you’re a data science student, there’s a good chance that you’re using a remote machine to do your work. However, if you’re stuck behind a jump server, the process of accessing the machine can seem daunting. Here’s the exact process to get it done! …


William of Ockham, creator of the idea of Occam’s Razor

This article is written as the final project for Theory of Computation at Smith College, Spring of 2020

Introduction

Randomness is a tricky concept to define. Imagine you have to pick a random word: is the word “apple” random? Is the word “antidisestablishmentarianism” random instead? Going solely by numbers, is “101111010100” random rather than “000000000000”? Technically, if one were to flip a coin 12 times, the latter is just as likely to occur as the former, but it is also clearly determined by a pattern. It can be rewritten as “write 0*12”, a description which is shorter…
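The “write 0*12” idea can be sketched in Python. The example below uses run-length encoding as one simple, assumed way of describing a string; the function name `describe` and the `+` separator are hypothetical choices for illustration, and run-length encoding is only a crude stand-in for “the shortest description”.

```python
from itertools import groupby


def describe(bits):
    """Build a run-length description, e.g. '0001' -> '0*3+1*1'."""
    return "+".join(f"{ch}*{len(list(run))}" for ch, run in groupby(bits))


patterned = "000000000000"
noisy = "101111010100"

patterned_description = describe(patterned)
noisy_description = describe(noisy)

# Compare each string's length with the length of its description.
print(len(patterned), len(patterned_description), patterned_description)
print(len(noisy), len(noisy_description), noisy_description)
```

Under this scheme the all-zero string shrinks to a description shorter than itself, while the description of the irregular string comes out longer than the string it describes.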


Panama Papers represented in neo4j

Are you interested in graph theory and NoSQL database design? Graph databases may offer you the tools that you need. Whether it’s pattern matching, recommender systems, fraud detection, social media, or more, graph databases are a great option. Neo4j is the most popular open-source graph database management system available to the public. In the following article, we’ll discuss graphs, relational databases, Neo4j, and py2neo for beginners. If you’ve taken an introductory Python course [CSC 111 at Smith College], you’ve got all the tools you need to get started.

So what is a graph?

If you’ve taken discrete…

Ananda Montoly

Programmer and student at Smith College in Northampton, Massachusetts. Aspiring data scientist.
