How to Read a File in Haskell

Sun, Dec 9, 2012

One of the myths about Haskell is that it makes even the simplest input/output operations difficult.

At least for basic needs, that’s not true. Here is the code to read and print the contents of a text file:

main = do
    x <- readFile "./jobs.txt"
    putStr x

That looks anything but hard to me. You get the content as a string and you can forget about how you obtained it.

If you want more details about how to read and write files in Haskell, read the I/O chapter of A Gentle Introduction to Haskell.

I am learning Haskell by coding the solutions to Coursera’s algo2 class. The problem sets come with a text file that we have to read and parse to get the data our algorithm should work on. To make this post more meaningful, I’m going to describe how to parse the content of a file structured as follows:

NUMBER_OF_LINES
WEIGHT, LENGTH
...
WEIGHT, LENGTH

The first line contains the number of entries in the file, while the other lines are the pairs of weights and lengths associated with each job.

To parse the job we need to define a couple of types:

-- Data type definitions. Strong typing rocks!
type Weight = Integer
type Length = Integer

data Job = Job Weight Length

getLength :: Job -> Length
getLength (Job _ job_length) = job_length

getWeight :: Job -> Weight
getWeight (Job job_weight _) = job_weight

If you are new to Haskell syntax, you should know that type defines an alias for an existing type, while data actually creates a new type that combines other data types.
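
To make the difference concrete, here is a minimal sketch. The helpers double, doubleWeight and totalOf are made up for this illustration and are not part of the parser.

```haskell
-- `type` only introduces a new name: Weight and Integer are the same type.
type Weight = Integer
type Length = Integer

-- `data` introduces a brand-new type with its own constructor.
data Job = Job Weight Length

-- Hypothetical helper written for plain Integers.
double :: Integer -> Integer
double n = n * 2

-- Accepted by the compiler, because a Weight is just an Integer.
doubleWeight :: Weight -> Weight
doubleWeight w = double w

-- A Job is not an Integer, so `double (Job 3 5)` would be rejected at
-- compile time. The fields have to be extracted by pattern matching first.
totalOf :: Job -> Integer
totalOf (Job w l) = w + l
```

In other words, an alias documents intent but adds no safety, while a data declaration makes the compiler keep jobs and numbers apart.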

A function definition in Haskell usually has two parts. First you declare the type of the function, specifying the input and output types. Remember that a function with multiple parameters is curried: its type is a chain of arrows, one per parameter, ending in the result type.
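
As a small sketch of what currying means in practice, consider a function with two parameters. The names scaledWeight and tripleWeight are hypothetical and exist only for this example.

```haskell
type Weight = Integer
type Length = Integer

data Job = Job Weight Length

-- Two parameters, so two arrows before the result type.
-- Read it as: Integer -> (Job -> Weight).
scaledWeight :: Integer -> Job -> Weight
scaledWeight factor (Job w _) = factor * w

-- Supplying only the first argument yields a new function Job -> Weight.
tripleWeight :: Job -> Weight
tripleWeight = scaledWeight 3
```

Here tripleWeight (Job 4 10) evaluates to 12, exactly as scaledWeight 3 (Job 4 10) does.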

The actual implementation has the name of the function and its parameters on the left side of the =. The expression that computes the result goes, of course, on the right side. Note that you can write several equations for the same function; pattern matching selects the first one whose patterns match the arguments.

The main of your application should look like this:
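
Here is a minimal sketch of a function defined by more than one equation. The function totalLength is a made-up helper for this illustration, not something the homework requires.

```haskell
type Weight = Integer
type Length = Integer

data Job = Job Weight Length

-- Hypothetical helper: the sum of the lengths of all jobs in a list.
totalLength :: [Job] -> Length
-- First equation: matches the empty list.
totalLength [] = 0
-- Second equation: matches a first job followed by the rest of the list.
totalLength (Job _ l : rest) = l + totalLength rest
```

The equations are tried from top to bottom, so totalLength [Job 1 2, Job 3 4] uses the second equation twice and then the first one, giving 6.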

main = do
    content <- readFile "./jobs.txt"
    let jobs = parse content
    let scheduledJobs = schedule jobs
    print (cost scheduledJobs)

You have to read the file, parse its content and perform some operations on it to produce a result. In this post I will just describe how to parse the content. Sorry, I’m not going to do the Coursera homework for you.

The parse function is pretty straightforward. It simply drops the first line, since we don’t care about the number of jobs in the list, and parses each remaining line separately. Note that [T] is the type of a list of values of type T.

parse :: String -> [Job]
parse content = map parseJob jobs_data
  where
    -- Drop the first line: it contains only the number of jobs
    jobs_data = drop 1 (lines content)
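
To see what the line splitting produces, here is a small sketch on made-up input. The sample content is invented for this example, and it assumes the two fields on each line are separated by a single space.

```haskell
-- Made-up file content: a count line followed by two data lines.
sample :: String
sample = "2\n8 50\n74 59\n"

-- lines splits the string at newline characters and removes them.
allLines :: [String]
allLines = lines sample          -- ["2","8 50","74 59"]

-- Everything except the count line. drop 1 behaves like tail here,
-- but returns [] instead of failing when the input is empty.
jobLines :: [String]
jobLines = drop 1 allLines       -- ["8 50","74 59"]
```

Each element of jobLines is one of the strings that gets handed to parseJob.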

We are almost done. Now we only need to write the function that parses a job.

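import Data.List.Split (splitOn)  -- from the split package

parseJob :: String -> Job
parseJob job_line =
    -- Split the line on single spaces and convert both fields to Integers
    let job_data = convert (splitOn " " job_line)
    in Job (job_data !! 0) (job_data !! 1)

-- The signature tells the compiler which type read should produce
convert :: [String] -> [Integer]
convert x = map read x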

It is quite simple. It takes a string, splits it on single spaces (you will need to import Data.List.Split, which is part of the split package) and converts the pieces using the read function. Note that I introduced the convert function so that the type inference algorithm can figure out that read should produce an Integer. I guess there is a better way to do that, but I am still a beginner and a quick search on the web didn’t help me find out how to do it properly.
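
One possible alternative, sketched here under a few assumptions, is to put a type annotation directly on the expression instead of writing a separate function. The name parseJobAnnotated is hypothetical, and the sketch swaps splitOn for words from the Prelude, which splits on any run of whitespace and needs no extra package. It also pattern matches on the resulting list rather than indexing with !!, so a line with the wrong number of fields is reported explicitly.

```haskell
type Weight = Integer
type Length = Integer

data Job = Job Weight Length

-- Variant of parseJob without a separate convert function.
parseJobAnnotated :: String -> Job
parseJobAnnotated jobLine =
    -- The annotation tells read to produce Integers.
    case map read (words jobLine) :: [Integer] of
        [w, l] -> Job w l
        _      -> error ("malformed job line: " ++ jobLine)
```

Since the fields of Job are already declared as Integer, the compiler may well be able to infer the type even without the annotation; writing it out simply makes the intent visible.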