Persistent data
Persistent data allows an application to keep state after the program that created it has stopped running.
Reading and writing to files
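A file must be opened before it can be read or written. The open() function takes two parameters: the location of the file and the mode, which describes how the file will be used (for example, "w" for writing, which creates the file or empties an existing one).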
>>> text_file = open("output.txt", "w")
open() returns a file object that provides methods for working with the file. Data can be written to the file with the write() method.
>>> line_one = "This is the first example line.\n"
>>> text_file.write(line_one)
After writing to the file, it should be closed so that any buffered data is flushed to disk and the underlying file handle is released.
>>> text_file.close()
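The open, write, close sequence above can also be written with a with statement, which closes the file automatically even if an error occurs. The following is a minimal sketch, not the only way to do it; the temporary directory and the second example line are assumptions added so the example does not touch real files.

```python
import os
import tempfile

# Assumption: a temporary directory stands in for the real file location.
with tempfile.TemporaryDirectory() as temp_directory:
    file_path = os.path.join(temp_directory, "output.txt")

    # "w" creates the file (or empties an existing one) for writing.
    with open(file_path, "w") as text_file:
        text_file.write("This is the first example line.\n")

    # "a" appends to the end of the file instead of overwriting it.
    with open(file_path, "a") as text_file:
        text_file.write("This is the second example line.\n")

    # "r" opens the file for reading; iterating yields one line at a time.
    with open(file_path, "r") as text_file:
        lines_read = [line.rstrip("\n") for line in text_file]

print(lines_read)
```

Each with block closes its file when the block ends, so no explicit close() call is needed.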
Filenames and paths
Files are organised into directories (folders). Every running program has a current directory, which is used by default for most operations. For example, a file will be created in the current directory unless another location is specified.
The os module can be used to work with files and directories.
>>> import os
>>> current_working_directory = os.getcwd()
>>> print(current_working_directory)
/Users/Kayra
The output of getcwd() is a path. A simple filename, like output.txt, is called a relative path, because it is interpreted relative to the current directory. In this case the full (absolute) path would be /Users/Kayra/output.txt.
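As a small illustration of how a relative path is resolved against the current directory, the sketch below uses os.path.abspath() and os.path.isabs(). The printed absolute path depends on where the code is run, so the example only compares values instead of showing a fixed path.

```python
import os

relative_path = "output.txt"

# abspath() combines the current directory with the relative path.
absolute_path = os.path.abspath(relative_path)
expected_path = os.path.join(os.getcwd(), relative_path)

matches_current_directory = absolute_path == expected_path
print(matches_current_directory)

# isabs() reports whether a path is already absolute.
print(os.path.isabs(relative_path), os.path.isabs(absolute_path))
```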
os.path provides other functions for working with filenames and paths. exists() checks whether a file or directory exists.
>>> print(os.path.exists("output.txt"))
True
isdir() checks whether a path is a directory. isfile() checks whether it is a file.
>>> print(os.path.isdir("output.txt"))
False
>>> print(os.path.isdir("/Users/Kayra"))
True
>>> print(os.path.isfile("output.txt"))
True
>>> print(os.path.isfile("/Users/Kayra"))
False
listdir() returns a list of files and directories in the provided directory.
>>> print(os.listdir(current_working_directory))
['output.txt', 'development', 'photos', 'test.py']
To demonstrate these functions, a custom function tree() can be created to recursively traverse all of the child directories and print the files they contain.
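def tree(base_directory):
    # Visit every entry (file or directory) in base_directory.
    for directory in os.listdir(base_directory):
        path = os.path.join(base_directory, directory)
        if os.path.isfile(path):
            print(path)
        else:
            # Anything that is not a file is treated as a directory.
            tree(path)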
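The standard library also offers os.walk(), which performs a similar recursive traversal. The sketch below is an alternative to tree(), not a replacement for it: list_files() is a hypothetical helper that returns the paths instead of printing them, and the small directory tree is made up for the demonstration.

```python
import os
import tempfile

def list_files(base_directory):
    """Return the paths of all files below base_directory, sorted."""
    file_paths = []
    # os.walk() yields one tuple per directory it visits.
    for directory_path, directory_names, file_names in os.walk(base_directory):
        for file_name in file_names:
            file_paths.append(os.path.join(directory_path, file_name))
    return sorted(file_paths)

# Build a small, made-up directory tree to traverse.
with tempfile.TemporaryDirectory() as base_directory:
    os.makedirs(os.path.join(base_directory, "photos", "holiday"))
    for relative_path in ["test.py", os.path.join("photos", "holiday", "beach.png")]:
        with open(os.path.join(base_directory, relative_path), "w") as new_file:
            new_file.write("")

    # Show the results relative to the base directory for readability.
    found = [os.path.relpath(path, base_directory) for path in list_files(base_directory)]

print(found)
```

Returning a list rather than printing makes the result easier to reuse, for example to count or filter the files.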
Databases
A database is a file that is designed to store and organise data. The module dbm provides an interface for creating and updating database files.
>>> import dbm
>>> database = dbm.open("captions", "c")
>>> database["cleese.png"] = "Photo of John Cleese."
>>> database["cleese.png"]
b'Photo of John Cleese.'
The object provided when reading data back from the database is a bytes object, denoted by the preceding b.
Data in a database can be changed and looped through as expected. Like files, databases should be closed after use.
>>> database["cleese.png"] = "Photo of John Cleese doing a silly walk."
>>> database["cleese.png"]
b'Photo of John Cleese doing a silly walk.'
>>> for key in database:
...     print(key, database[key])
>>> database.close()
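Because values come back as bytes, they usually need decoding before being used as text. The following sketch assumes a temporary directory for the database file and a second, made-up caption entry; it uses with blocks so the database is closed automatically, and keys() to loop over the stored entries.

```python
import dbm
import os
import tempfile

# Assumption: a temporary directory holds the database file(s).
with tempfile.TemporaryDirectory() as temp_directory:
    database_path = os.path.join(temp_directory, "captions")

    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(database_path, "c") as database:
        database["cleese.png"] = "Photo of John Cleese."
        database["jones.png"] = "Photo of Terry Jones."  # made-up entry

    # Reopen read-only ("r") to show that the data persisted.
    with dbm.open(database_path, "r") as database:
        captions = {
            key.decode("utf-8"): database[key].decode("utf-8")
            for key in database.keys()
        }

print(sorted(captions.items()))
```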
Pipes
Pipes can be used to run operating system commands from Python.
A pipe object represents a running program and is returned by the os.popen() function. It can be used much like a file: read() returns all of the program's output, readline() returns it one line at a time, and close() returns the final status, which is None when the command succeeds.
>>> import os
>>> file_name = "book.txt"
>>> checksum_command = "md5sum"
>>> full_command = checksum_command + " " + file_name
>>> pipe_object = os.popen(full_command)
>>> result = pipe_object.read()
>>> final_status = pipe_object.close()
>>> print(result)
1e0033f0ed0656636de0d75144ba32e0 book.txt
>>> print(final_status)
None
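Since the pipe object behaves like a file, it can also be iterated over to process the output line by line. The sketch below assumes a shell where echo and && are available (true of common POSIX shells and the Windows command prompt), and uses echo rather than md5sum so that it does not depend on a particular file or tool being present.

```python
import os

# Assumption: the shell provides "echo" and "&&".
command = "echo first line && echo second line"

pipe_object = os.popen(command)

# Iterating over the pipe yields one line of output at a time.
lines = [line.strip() for line in pipe_object]

# close() returns None when the command exited successfully.
final_status = pipe_object.close()

print(lines)
print(final_status)
```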
Pickling
A drawback of dbm is that the keys and values have to be strings or bytes. The pickle module translates almost any type of object into a bytes object suitable for storage in a database, and translates those bytes back into objects.
pickle.dumps() takes an object as a parameter and returns a byte string representation. This byte string can be passed to pickle.loads() to recreate the object.
>>> import pickle
>>> list_of_numbers = [1, 2, 3]
>>> list_of_numbers_data = pickle.dumps(list_of_numbers)
>>> print(list_of_numbers_data)
b'\x80\x03]q\x00(K\x01K\x02K\x03e'
>>> new_list_of_numbers = pickle.loads(list_of_numbers_data)
>>> print(new_list_of_numbers)
[1, 2, 3]
>>> list_of_numbers == new_list_of_numbers
True
>>> list_of_numbers is new_list_of_numbers
False
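Putting the two modules together, a pickled object can be stored as a value in a dbm database and rebuilt later. This is a minimal sketch under assumptions: the temporary directory, the database name "numbers" and the key "primes" are all made up for illustration.

```python
import dbm
import os
import pickle
import tempfile

# Assumption: a temporary directory holds the database file(s).
with tempfile.TemporaryDirectory() as temp_directory:
    database_path = os.path.join(temp_directory, "numbers")

    with dbm.open(database_path, "c") as database:
        # dbm cannot store a list directly, so store its pickled bytes.
        database["primes"] = pickle.dumps([2, 3, 5, 7])

    with dbm.open(database_path, "r") as database:
        # loads() rebuilds an equivalent list from the stored bytes.
        restored_primes = pickle.loads(database["primes"])

print(restored_primes)
```

Unpickling can execute arbitrary code, so pickle.loads() should only be used on data from a trusted source.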