Input/output in POSIX
Both the C library and the POSIX standard library contain file I/O functions. Usually, the C library ones build on the POSIX ones if you are on a POSIX system.
The POSIX I/O functions are declared in fcntl.h
and unistd.h
. The main ones are:
Open and Close
int open(const char *pathname, int flags, [mode_t mode])
opens a file and returns a file descriptor. The flags are the boolean OR (|
) of one or more of the following:O_CREAT
: if the file does not exist, create it. In this case the optional mode parameter can be used to specify the permissions for the newly created file. If theO_CREAT
flag is omitted and the file does not exist,open
returns an error.O_EXCL
: return an error if the file already exists (only really useful together withO_CREAT
).O_PATH
: don't really open the file, just make a file descriptor (that you can use to e.g.stat
the file). In this case you do not need one of the three flags mentioned in the next point.O_RDONLY
,O_WRONLY
,O_RDWR
: exactly one of these mutually exclusive flags must be set to indicate whether you want read, write or both access modes on the file.O_TRUNC
: if the file already exists, then its previous contents are overwritten if you open it to write.O_NONBLOCK
: this is the most interesting option and the main reason why you might want to useopen
overfopen
. More details on this in the next activity.- There are a lot more flags that you can see with
man 2 open
.
-
int close(int fd)
: close a file descriptor. Returns 0 on success and -1 (setting errno) on error.
You might wonder how a C function call can have an optional parameter. The function is actually declared in fcntl.h
as int open(const char*, ...)
where the ellipsis means any number of parameters of any type just like printf
uses.
If open()
fails, it returns a negative number that indicates the error code: see errno.h
for the names of the codes and the manual page (man 2 open
) for what each one means for this function.
(Non)blocking IO
When you read from a regular file, there is a short delay while the kernel does the actual reading for you - possibly it has to fetch the data from disk. This is not what is meant by blocking. However, when you read from a pipe (e.g. you call a | b
in the shell and b
reads from standard input) or from a named pipe (FIFO), then if no process is connected to the writing end, your system call blocks until someone writes to the pipe.
The Posix functions offer an option O_NONBLOCK
that makes a call fail instead of block. This is one of the reasons that you might use read
directly rather than fread
. While this is an important topic for concurrent programming (and especially network programming), that can wait until second year.
Read and Write
ssize_t read(int fd, void *buf, size_t count)
: read up tocount
bytes from the file descriptorfd
into the buffer. In case of an error, this returns -1 and puts the error code inerrno
. If the end of file is reached, then the return value might be less than count (possibly zero) but this does not count as an error.ssize_t write(int fd, const void *buf, size_t count)
: the same but for writing to a file (descriptor).
Before you use any of these functions, there are two warnings. The small one is that unlike the C library which provides buffered input/output, the POSIX functions do not - so while it's ok to read a large file one character at a time with fread
due to buffering, this becomes really inefficient if you use the basic read
.
The big warning is that checking return values is even more important than usual because these functions can return two kinds of errors: fatal ones (e.g. file not found) and temporary ones, which basically means try again, so in some cases the correct pattern to use these functions is to call them in a loop that repeats until the function either succeeds or returns a fatal error. Specifically, read()
can return any of the following:
- -1,
errno = EINTR
: the call was interrupted, try again. - -1,
errno = EAGAIN
orerrno = EWOULDBLOCK
: you have requested non-blocking IO, and the call would block. - -1, any other errno: a fatal error occurred and
errno
gives more information. You must not retry the call.
Exercise
In the sources for busybox (git clone git://busybox.net/busybox.git
), use your shell skills to find the source file and line where the function safe_read
is defined, and study the pattern used there to operate read
in a loop.