Using the strtok() and strtok_r() functions in C


In this article, we’ll take a look at using the strtok() and strtok_r() functions in C.

These handy utility functions are useful when you want to tokenize a string, that is, split an input string into tokens.

Let’s see how to use them, with suitable examples.


Using the strtok() function

First, let’s look at the strtok() function.

This function is declared in the <string.h> header file, so you must include it in your program.

#include <string.h>

char* strtok(char* str, const char* delim);

This takes an input string str and a delimiter string delim. Every character in delim counts as a delimiter on its own.

strtok() will split the string into tokens wherever it finds one of these delimiter characters.

We expect a list of strings from strtok(). But the function returns us a single string! Why is this?

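The reason is how the function handles the tokenization. Calling strtok(input, delim) returns only the first token. The function overwrites the delimiter that ends this token with a '\0' character and remembers internally where it stopped. This also means that strtok() modifies the input, so the input must be a writable array and not a string literal.

To get the remaining tokens, we pass NULL as the input string, which tells strtok() to continue from the saved position.

Basically, we need to keep calling strtok(NULL, delim) until it returns NULL.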

Seems confusing? Let’s look at an example to clear it up!

#include <stdio.h>
#include <string.h>

int main() {
    // Our input string
    char input_string[] = "Hello from JournalDev!";

    // Our output token list: at most 20 tokens of up to 19 characters each
    char token_list[20][20];

    // We call strtok(input, delim) to get our first token
    // Notice the double quotes: delim is a string (char*), not a single char
    char* token = strtok(input_string, " ");

    int num_tokens = 0; // Index to token list. We will append to the list

    // Keep getting tokens until strtok() returns NULL or the list is full
    while (token != NULL && num_tokens < 20) {
        // Bounded copy, so an overlong token cannot overflow its slot
        snprintf(token_list[num_tokens], sizeof token_list[num_tokens], "%s", token);
        num_tokens++;
        token = strtok(NULL, " "); // Get the next token. Notice that input=NULL now!
    }

    // Print the list of tokens
    printf("Token List:\n");
    for (int i=0; i < num_tokens; i++) {
        printf("%s\n", token_list[i]);
    }

    return 0;
}

So, we have our input string “Hello from JournalDev!”, and we’re trying to tokenize it by spaces.

We get the first token using strtok(input, " "). Notice the double quotes, as the delimiter is a single character string!

Afterwards, we keep getting tokens using strtok(NULL, " ") and loop until we get NULL from strtok().

Let’s look at the output now.

Output

Token List:
Hello
from
JournalDev!

Indeed, we seem to have got the correct tokens!

Similarly, let’s now look at using strtok_r().


Using the strtok_r() function

This function is very similar to the strtok() function. The key difference is that the _r suffix means that this is a re-entrant function.

A re-entrant function is a function that can safely be called again while an earlier call is still in progress, for example from another thread or from a signal handler. This works because the function does not rely on hidden shared state.

strtok() is not re-entrant. It keeps its current position in hidden internal state, so starting to tokenize a second string overwrites the position saved for the first one.

Because strtok_r() does not use this hidden state, it can safely be used from multiple threads, as long as each thread works on its own string.

In short, strtok_r() is the re-entrant, thread-safe version of strtok().

It takes one extra parameter, called the context. strtok_r() stores its current position there, so that the next call can resume from the right place.

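NOTE: strtok_r() is specified by POSIX, so it is available on Linux, macOS and other Unix-like systems. If you’re using Windows, the Microsoft C runtime provides strtok_s() instead, which takes the same three arguments. (The optional strtok_s() from C11 Annex K is a different function with four parameters.)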

#include <string.h>

char *strtok_r(char *str, const char *delim, char **context);

The context parameter is a pointer to a char* variable, which strtok_r() uses to save its position between calls.

Usually, we declare a char* variable ourselves and pass its address.

Let’s look at the same example for strtok(), now using strtok_r() (or strtok_s() on Windows).

#include <stdio.h>
#include <string.h>

int main() {
    // Our input string
    char input_string[] = "Hello from JournalDev!";

    // Our output token list: at most 20 tokens of up to 19 characters each
    char token_list[20][20];

    // A pointer that strtok_r() will use as its context variable
    // Initially, we set it to NULL
    char* context = NULL;

    // We pass the address of context, so that strtok_r() can store its
    // position there and refer to it on later calls
    char* token = strtok_r(input_string, " ", &context);

    int num_tokens = 0; // Index to token list. We will append to the list

    // Keep getting tokens until strtok_r() returns NULL or the list is full
    while (token != NULL && num_tokens < 20) {
        // Bounded copy, so an overlong token cannot overflow its slot
        snprintf(token_list[num_tokens], sizeof token_list[num_tokens], "%s", token);
        num_tokens++;
        token = strtok_r(NULL, " ", &context); // We pass the same context variable again
    }

    // Print the list of tokens
    printf("Token List:\n");
    for (int i=0; i < num_tokens; i++) {
        printf("%s\n", token_list[i]);
    }

    return 0;
}

Output

Token List:
Hello
from
JournalDev!

We get the same output, but this version does not depend on hidden state, so it is safe to use in multi-threaded programs.


Conclusion

In this article, we learned about how we could use the strtok() and strtok_r() functions in C, to tokenize strings easily.

For similar content, do go through our tutorial section on C programming!
