Dr. Roger Ianjamasimanana

How to read a file in C?

1. What is file I/O in C?

File input/output (I/O) in C refers to the methods and functions that enable a program to interact with files. This includes opening files, reading data from them, writing data to them, and closing them after operations are complete.

2. How to open a file in C?

Before reading from a file, it must be opened using the fopen() function. This function returns a pointer to a FILE object that represents the opened file.

The syntax to open a file in C is:

FILE *fopen(const char *filename, const char *mode);

- filename: the name of the file to open.

- mode: the mode in which to open the file (e.g., "r" for reading).

Our file, example-file.txt, has the following content:
Hello, world!
This is a test file.
I am learning a C programming language.

This file is located in the /home/ directory and we can open it like this:

FILE *file = fopen("/home/example-file.txt", "r");
if(file == NULL) {
    perror("Error opening file");
    return 1;
}
The perror() function prints a descriptive error message to stderr based on the current value of errno.
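
Because perror() only prints the prefix you give it plus the system error text, the message does not say which file failed. The following is a minimal sketch of one way to include the file name, using strerror() and errno from the standard library. The helper name open_for_reading is hypothetical, and the sketch assumes a platform (such as a POSIX system) where fopen() sets errno on failure.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the C library): opens path for
   reading and, on failure, reports the file name together with the
   system error text. Returns NULL if the file could not be opened. */
FILE *open_for_reading(const char *path) {
    FILE *file = fopen(path, "r");
    if (file == NULL) {
        fprintf(stderr, "Cannot open '%s': %s\n", path, strerror(errno));
    }
    return file;
}
```

A caller would use it in place of fopen(), for example open_for_reading("/home/example-file.txt"), and still check the result against NULL before reading.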

3. How to read a file in C?

Once the file is opened, you can read its contents using various functions such as fgetc(), fgets(), or fread().

In summary:

  • fgetc() is best for reading individual characters from a file.
  • fgets() is ideal for reading entire lines as strings.
  • fread() excels at reading binary data or large blocks of data efficiently.

3.1 Using fgetc()

The fgetc() function reads a single character from a file.

The basic syntax of fgetc() is:

int fgetc(FILE *stream);

where,

  1. FILE *stream
    • Type: Pointer to FILE structure (FILE *)
    • Description: Represents the file stream from which the character will be read.
    • Usage:
      • This pointer is obtained by opening a file using functions like fopen().
      • It must refer to a valid, open file in a mode that allows reading (e.g., "r", "rb").
  2. Return value
    • Type: int
    • Possible Returns:
      • Character Read: Returns the character read as an unsigned char converted to an int. Every possible character value is therefore non-negative and distinct from EOF.
      • EOF: Returns the macro EOF (a negative value, typically -1) if the end of the file is reached or if an error occurs during reading. Use feof() or ferror() to tell the two cases apart.

Important

Since fgetc() returns an int, store its return value in an int variable (not a char) so that every possible character is handled correctly and EOF can be detected.

#include <stdio.h>

int main(void) {
    FILE *file = fopen("/home/example-file.txt", "r");
    if(file == NULL) {
        perror("Error opening file");
        return 1;
    }

    int ch;
    while((ch = fgetc(file)) != EOF) {
        putchar(ch);
    }

    fclose(file);
    return 0;
}

After completing file operations, it's crucial to close the file using the fclose() function to free resources and ensure data integrity.

Output:

Hello, world!
This is a test file.
I am learning a C programming language.
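
The reason for storing the result of fgetc() in an int becomes visible with binary data. A file may contain the byte 0xFF. If ch were a char, then on typical platforms where char is signed and EOF is -1, that byte would compare equal to EOF and the loop would stop early; where char is unsigned, ch could never equal EOF and the loop would not end. The sketch below copies a stream byte by byte and is safe for such data. The function name copy_bytes is hypothetical.

```c
#include <stdio.h>

/* Copies every byte from in to out with fgetc()/fputc().
   Returns the number of bytes copied, or -1 on a read or write error. */
long copy_bytes(FILE *in, FILE *out) {
    int ch;            /* int, not char: must hold 0..255 and EOF */
    long copied = 0;

    while ((ch = fgetc(in)) != EOF) {
        if (fputc(ch, out) == EOF) {
            return -1;  /* write failed */
        }
        copied++;
    }
    return ferror(in) ? -1 : copied;
}
```

Calling copy_bytes(file, stdout) on a file opened with fopen() behaves like the loop in the program above, with the addition of error reporting.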

Now, let's use fgetc() to count the number of characters in a file.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *file = fopen("/home/example-file.txt", "r");
    if (file == NULL) {
        perror("Error opening file");
        return EXIT_FAILURE;
    }

    int ch;
    long count = 0;

    while ((ch = fgetc(file)) != EOF) {
        count++;
    }

    if (ferror(file)) {
        perror("Error reading from file");
        fclose(file);
        return EXIT_FAILURE;
    }

    fclose(file);
    printf("Total number of characters: %ld\n", count);
    return EXIT_SUCCESS;
}

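The count includes spaces, punctuation, and newline characters. For the three-line example-file.txt shown in section 2, assuming each line ends with a single newline character, the program prints:

Total number of characters: 75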
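
The same fgetc() loop can count other things. As a sketch, the function below counts lines instead of characters; a final line that lacks a trailing newline is still counted. The name count_lines is hypothetical and the function takes an already opened stream.

```c
#include <stdio.h>

/* Counts the lines in a stream. A last line without a trailing newline
   is still counted. Returns -1 on a read error. */
long count_lines(FILE *stream) {
    int ch;
    int last = '\n';   /* treat the start of the file as the start of a line */
    long lines = 0;

    while ((ch = fgetc(stream)) != EOF) {
        if (ch == '\n') {
            lines++;
        }
        last = ch;
    }
    if (ferror(stream)) {
        return -1;
    }
    if (last != '\n') {
        lines++;       /* the final line had no newline */
    }
    return lines;
}
```

For the three-line example file shown earlier, count_lines() returns 3 whether or not the last line ends with a newline.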

3.2 Using fgets()

The fgets() function reads a line of text from a file, spaces included. It stops after a newline character (which it stores in the buffer), at the end of the file, or once n - 1 characters have been read, whichever comes first.

The basic syntax of fgets() is:

char *fgets(char *str, int n, FILE *stream);

The parameters are:

  1. char *str
    • Type: pointer to a character array (buffer).
    • Description: Destination buffer where the read string will be stored.
    • Usage: Must be large enough to hold the expected line plus the null terminator.
  2. int n
    • Type: integer.
    • Description: maximum number of characters to store, including the null terminator; fgets() reads at most n - 1 characters from the file.
    • Usage: defines the buffer size to prevent buffer overflows.
  3. FILE *stream
    • Type: pointer to a FILE object.
    • Description: represents the file stream from which to read the string.
    • Usage: obtained by opening a file using fopen() in a mode that allows reading.

Return value

  • char *
    • Success: returns the pointer to the string (str) if successful.
    • End of File or Error: returns NULL if end of file is reached before any characters are read or if an error occurs.

The following program reads the file example-read-file.txt line by line and prints each line:

#include <stdio.h>

int main(void) {
    FILE *file = fopen("/home/example-read-file.txt", "r");
    if(file == NULL) {
        perror("Error opening file");
        return 1;
    }

    char buffer[100];
    while(fgets(buffer, sizeof(buffer), file) != NULL) {
        printf("%s", buffer);
    }

    fclose(file);
    return 0;
}

After running the above program, we get the following output:

This is the first line.
This is the second line.
This is the third line.
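
Because fgets() keeps the newline character in the buffer, programs that compare or store lines usually remove it first. The sketch below does that with strcspn() from <string.h>, and its return value also tells you whether the whole line fit in the buffer. The name strip_newline is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Removes the trailing newline that fgets() stores, if there is one.
   Returns 1 if a newline was removed (the whole line fit in the buffer),
   0 otherwise (the line was truncated, or the file ended without a newline). */
int strip_newline(char *line) {
    size_t len = strcspn(line, "\n");  /* index of the first '\n', or of '\0' */
    if (line[len] == '\n') {
        line[len] = '\0';
        return 1;
    }
    return 0;
}
```

Inside the reading loop, calling strip_newline(buffer) after each successful fgets() leaves just the text of the line. A return value of 0 before the end of the file means the line was longer than the buffer, and the next fgets() call will return the rest of it.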

3.3 Using fread()

The fread() function reads binary data from a file into memory. It can also be used to read a text file. The general syntax of fread() is:

size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);

Explanation of the parameters:

  1. void *ptr
    • Description: a pointer to a block of memory where the read data will be stored.
    • Usage: this is typically an array or a buffer where the data from the file will be copied.
  2. size_t size
    • Description: the size in bytes of each element to be read.
    • Usage: for example, sizeof(int) if you're reading integers, sizeof(char) for characters, etc.
  3. size_t nmemb
    • Description: the number of elements, each of size size, to be read.
    • Usage: if you want to read an array of 100 integers, nmemb would be 100.
  4. FILE *stream
    • Description: a pointer to a FILE object that specifies the input stream (file) to read from.
    • Usage: This is the file you have opened using fopen() in read ("rb") or read/update ("r+b") mode.

Return value

size_t
  • Description: The total number of elements successfully read, which should be equal to nmemb if the read was successful.
  • Usage: If the return value is less than nmemb, it indicates that either an error occurred or the end of the file (EOF) was reached before reading the desired number of elements.
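
A short count from fread() does not say by itself whether the file simply ended or a read error occurred; ferror() and feof() make that distinction. The sketch below wraps fread() for int elements and reports the cause. The helper name read_ints and its hit_error parameter are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to nmemb ints into dest and reports why
   a short read happened. *hit_error is set to 1 on a read error and to
   0 otherwise. Returns the number of ints actually read. */
size_t read_ints(int *dest, size_t nmemb, FILE *stream, int *hit_error) {
    size_t got = fread(dest, sizeof(int), nmemb, stream);

    *hit_error = 0;
    if (got < nmemb && ferror(stream)) {
        *hit_error = 1;   /* the stream's error indicator is set */
    }
    /* Otherwise a short count means the end of the file was reached,
       and feof(stream) returns a nonzero value. */
    return got;
}
```

If the file holds three integers and five are requested, read_ints() returns 3 with *hit_error set to 0, which the caller can treat as a normal end of file.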

Let's read the binary file example-binary-data.bin, which stores int values in binary form, using the following C program:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *file = fopen("/home/example-binary-data.bin", "rb"); // Open file in read-binary mode
    if(file == NULL) {
        perror("Error opening file");
        return EXIT_FAILURE;
    }

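    // Determine file size; fseek() and ftell() can fail, so check both
    if(fseek(file, 0, SEEK_END) != 0) {
        perror("Error seeking in file");
        fclose(file);
        return EXIT_FAILURE;
    }
    long filesize = ftell(file);
    if(filesize <= 0) {
        fprintf(stderr, "File is empty or its size could not be determined\n");
        fclose(file);
        return EXIT_FAILURE;
    }
    rewind(file);

    // Calculate the number of whole integers in the file
    size_t num_integers = (size_t)filesize / sizeof(int);

    // Allocate memory to hold the integers
    int *buffer = malloc((size_t)filesize);
    if(buffer == NULL) {
        perror("Memory allocation failed");
        fclose(file);
        return EXIT_FAILURE;
    }

    // Read the integers into the buffer
    size_t elements_read = fread(buffer, sizeof(int), num_integers, file);
    if(elements_read != num_integers) {
        // errno is only meaningful if the stream's error indicator is set
        if(ferror(file)) {
            perror("Error reading from file");
        } else {
            fprintf(stderr, "Unexpected end of file\n");
        }
        free(buffer);
        fclose(file);
        return EXIT_FAILURE;
    }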

    // Print the integers
    printf("Numbers read from binary file:\n");
    for(size_t i = 0; i < num_integers; i++) {
        printf("%d ", buffer[i]);
    }
    printf("\n");

    // Cleanup
    free(buffer);
    fclose(file);
    return EXIT_SUCCESS;
}

Output:

Numbers read from binary file:
10 20 30 40 50
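
The article does not show how example-binary-data.bin was created. As an assumption, a file like it could be produced by writing an array of int values with fwrite(), the counterpart of fread(). The sketch below does this; the name write_ints is hypothetical. A file written this way uses the machine's native int size and byte order, so it is only guaranteed to read back correctly on a similar system.

```c
#include <stdio.h>

/* Writes n ints to path in the machine's native binary representation.
   Returns 0 on success, -1 on failure. */
int write_ints(const char *path, const int *values, size_t n) {
    FILE *file = fopen(path, "wb");   /* binary mode, existing content is discarded */
    if (file == NULL) {
        return -1;
    }
    size_t written = fwrite(values, sizeof(int), n, file);
    /* fclose() flushes buffered data, so its result matters for writes */
    if (fclose(file) != 0 || written != n) {
        return -1;
    }
    return 0;
}
```

For example, calling write_ints() with an array holding 10, 20, 30, 40, 50 and n set to 5 produces a file that the fread() program above would print as five numbers.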

5. Reading data by column

When dealing with files that contain structured data, such as numbers arranged in multiple columns, you may often need to read and process specific columns. This is particularly useful in data analysis, where each column may represent a different variable or parameter.

5.1 Structured data file example

Consider a file named data.txt with the following content, representing five columns of numbers:

1  2  3  4  5
6  7  8  9 10
11 12 13 14 15
16 17 18 19 20

Our goal is to read the second column (values 2, 7, 12, 17) from this file.

To read a specific column from a structured data file, you can follow these steps:

  1. Open the file using fopen().
  2. Read the file line by line using fgets().
  3. Parse each line to extract the desired column using sscanf() or strtok().
  4. Store or process the extracted column data as needed.
  5. Close the file using fclose().

Here's a complete C program that reads the second column from data.txt and prints the extracted values:

#include <stdio.h>

#define MAX_LINE_LENGTH 100

int main(void) {
    FILE *file = fopen("/home/data.txt", "r");
    if(file == NULL) {
        perror("Error opening file");
        return 1;
    }

    char line[MAX_LINE_LENGTH];
    int column_value;

    printf("Second column values:\n");

    while(fgets(line, sizeof(line), file) != NULL) {
        // %*d reads and discards the first column; %d stores the second
        if(sscanf(line, "%*d %d", &column_value) == 1) {
            printf("%d\n", column_value);
        } else {
            fprintf(stderr, "Error parsing line: %s", line);
        }
    }

    fclose(file);
    return 0;
}

Output:

Second column values:
2
7
12
17
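
Once a column can be extracted, it can be processed instead of printed. As a sketch of the "store or process" step, the function below adds up the second column using the same sscanf() format. The name sum_second_column is hypothetical, and the sketch assumes lines shorter than 100 characters, as in the program above.

```c
#include <stdio.h>

/* Sums the second column of a whitespace-separated file of integers.
   Lines that cannot be parsed are skipped. Stores the total in *sum and
   returns the number of values that were added. */
long sum_second_column(FILE *stream, long *sum) {
    char line[100];
    int value;
    long used = 0;

    *sum = 0;
    while (fgets(line, sizeof(line), stream) != NULL) {
        /* %*d skips the first column, %d stores the second */
        if (sscanf(line, "%*d %d", &value) == 1) {
            *sum += value;
            used++;
        }
    }
    return used;
}
```

For the data.txt content shown above, the function returns 4 and sets *sum to 38 (2 + 7 + 12 + 17).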

Now let's use strtok() to achieve the same result as above.

We tokenize each line to extract the desired column by using the strtok() function.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE_LENGTH 100
#define TARGET_COLUMN 2 // Column index starts at 1

int main(void) {
    FILE *file = fopen("/home/data.txt", "r");
    if(file == NULL) {
        perror("Error opening file");
        return 1;
    }

    char line[MAX_LINE_LENGTH];
    char *token;
    int current_column;
    int column_value;
    int found;              // set to 1 once the target column has been read
    long line_number = 0;

    printf("Second column values:\n");

    while(fgets(line, sizeof(line), file) != NULL) {
        line_number++;
        current_column = 1;
        found = 0;
        token = strtok(line, " \t\n");
        while(token != NULL) {
            if(current_column == TARGET_COLUMN) {
                column_value = atoi(token);
                printf("%d\n", column_value);
                found = 1;
                break;
            }
            current_column++;
            token = strtok(NULL, " \t\n");
        }
        // strtok() has modified line, so report the line number rather than its text
        if(!found) {
            fprintf(stderr, "Line %ld does not have %d columns\n", line_number, TARGET_COLUMN);
        }
    }

    fclose(file);
    return 0;
}

We get the following output:

Second column values:
2
7
12
17
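
The program above converts each token with atoi(), which returns 0 for text that is not a number and has undefined behavior when the value is out of range, so bad input cannot be detected. A more defensive sketch uses strtol(), which reports where parsing stopped and sets errno on overflow. The helper name parse_int is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Converts token to an int. Returns 1 on success and stores the result
   in *out; returns 0 if token is not a complete, in-range integer. */
int parse_int(const char *token, int *out) {
    char *end;
    long value;

    errno = 0;
    value = strtol(token, &end, 10);

    if (end == token || *end != '\0') {
        return 0;    /* no digits at all, or trailing junk such as "12abc" */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return 0;    /* the value does not fit in an int */
    }
    *out = (int)value;
    return 1;
}
```

In the strtok() loop, the call to atoi(token) could be replaced with a check such as parse_int(token, &column_value), printing the value only when it returns 1.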

8. C File I/O functions summary

Function | Prototype | Purpose | Typical use cases
fgetc() | int fgetc(FILE *stream); | Reads a single character from a file. | Character-by-character processing, simple traversals.
fgets() | char *fgets(char *str, int n, FILE *stream); | Reads a string (line) from a file. | Line-by-line processing, reading configuration files.
fread() | size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream); | Reads binary data into memory. | Binary file processing, efficient bulk data reading, struct serialization.

9. Best practices for reading files in C

  • Always check if the file was opened successfully: before performing any operations, ensure that the file was opened without errors.
  • Use appropriate modes: open files in the correct mode ("r" for reading, "w" for writing, etc.) to prevent data loss.
  • Handle errors explicitly: C has no exceptions, so check the return value of every I/O call and print a meaningful error message (for example with perror()) when one fails.
  • Close files after operations: always close files to free resources and maintain data integrity.
  • Respect buffer sizes: when using functions like fgets(), pass the actual size of the buffer (for example sizeof(buffer)) and make the buffer large enough to hold the expected data.
  • Consider the size of the file: for large files, read data in chunks rather than loading the entire file into memory.
  • Use binary mode for non-text files: when dealing with binary data, open files in binary mode ("rb" or "wb") to prevent data corruption.
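
The advice about large files can be sketched with fread() and a fixed-size buffer, so that only one chunk is held in memory at a time. The function name total_bytes_chunked and the chunk size of 4096 bytes are arbitrary choices for this illustration; the loop body is where real processing of each chunk would go.

```c
#include <stdio.h>

#define CHUNK_SIZE 4096   /* arbitrary size chosen for this sketch */

/* Reads a stream in fixed-size chunks and returns the total number of
   bytes seen, or -1 on a read error. */
long total_bytes_chunked(FILE *stream) {
    unsigned char chunk[CHUNK_SIZE];
    size_t got;
    long total = 0;

    /* fread() returns 0 once nothing more can be read */
    while ((got = fread(chunk, 1, sizeof(chunk), stream)) > 0) {
        /* process chunk[0] .. chunk[got - 1] here */
        total += (long)got;
    }
    return ferror(stream) ? -1 : total;
}
```

Unlike the fseek()/ftell() approach, this loop does not need to know the file size in advance and never allocates memory proportional to it.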
