In this article, we’ll look at how to use the strtok() and strtok_r() functions in C.
These standard library functions are useful whenever you need to tokenize a string, that is, split an input string into tokens.
Let’s walk through both functions with suitable examples.
Using the strtok() function
First, let’s look at the strtok() function.
This function is declared in the <string.h> header file, so you must include it in your program.
#include <string.h>
char* strtok(char* str, const char* delim);
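This takes an input string str and a delimiter string delim. Every character in delim is treated as a separator, so " " splits on spaces, while ", " splits on both commas and spaces.
strtok() splits the string into tokens wherever it finds one of these delimiter characters.
We might expect a list of strings from strtok(). But the function returns just a single string! Why is this?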
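The reason is how the function handles the tokenization. The first call, strtok(input, delim), returns only the first token, and the function remembers internally where it stopped.
To get the remaining tokens, we keep calling strtok(NULL, delim). The NULL input tells the function to continue with the same string. Each call returns the next token, and once no tokens are left, it returns NULL.
Note that strtok() modifies the input string: it overwrites the delimiter after each token with '\0'. The input must therefore be a writable character array, not a string literal.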
Seems confusing? Let’s look at an example to clear it up!
#include <stdio.h>
#include <string.h>
#define MAX_TOKENS 20
#define MAX_TOKEN_LEN 20
int main(void) {
    // Our input string. It must be a writable array, since strtok() modifies it
    char input_string[] = "Hello from JournalDev!";
    // Our output token list
    char token_list[MAX_TOKENS][MAX_TOKEN_LEN];
    // We call strtok(input, delim) to get our first token
    // Notice the double quotes on delim! It is a string (const char*), not a single char
    char* token = strtok(input_string, " ");
    int num_tokens = 0; // Index to token list. We will append to the list
    // Keep getting tokens until strtok() returns NULL or the list is full
    while (token != NULL && num_tokens < MAX_TOKENS) {
        // Copy to token list, truncating any token that does not fit
        snprintf(token_list[num_tokens], MAX_TOKEN_LEN, "%s", token);
        num_tokens++;
        token = strtok(NULL, " "); // Get the next token. Notice that input=NULL now!
    }
    // Print the list of tokens
    printf("Token List:\n");
    for (int i = 0; i < num_tokens; i++) {
        printf("%s\n", token_list[i]);
    }
    return 0;
}
So, we have our input string “Hello from JournalDev!”, and we’re trying to tokenize it by spaces.
We get the first token using strtok(input, " "). Notice the double quotes: the delimiter argument is a string, even when it holds only a single character!
Afterwards, we keep getting tokens using strtok(NULL, " ") and loop until we get NULL from strtok().
Let’s look at the output now.
Output
Token List:
Hello
from
JournalDev!
Indeed, we seem to have got the correct tokens!
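The delimiter argument deserves one more look: every character in it counts as a separator, and runs of consecutive delimiters are skipped. Here is a minimal sketch to illustrate this. The helper names split_in_place() and print_tokens() are our own inventions for this illustration, not library functions, and the sample input mentioned below is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a library function): splits `text` in place on
 * any character found in `delims` and stores pointers to the tokens in
 * `out`. Returns the number of tokens stored (at most `max_tokens`). */
size_t split_in_place(char *text, const char *delims,
                      char *out[], size_t max_tokens) {
    assert(text != NULL && delims != NULL && out != NULL);
    size_t count = 0;
    char *token = strtok(text, delims);
    while (token != NULL && count < max_tokens) {
        out[count++] = token;          /* points into `text`, nothing is copied */
        token = strtok(NULL, delims);  /* continue with the same string */
    }
    return count;
}

/* Hypothetical helper: prints each token on its own line. */
void print_tokens(char *const tokens[], size_t count) {
    for (size_t i = 0; i < count; i++) {
        printf("%s\n", tokens[i]);
    }
}
```

If we call split_in_place() on a writable buffer holding "red, green;;blue" with the delimiter string ", ;", we get three tokens: red, green and blue. The empty field between the two semicolons is skipped. Afterwards, the buffer itself reads as just "red", because strtok() has overwritten the delimiter after each token with '\0'.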
Similarly, let’s now look at using strtok_r().
Using the strtok_r() function
This function is very similar to the strtok() function. The key difference is the _r suffix, which means that this is a re-entrant function.
strtok() remembers its position in hidden internal state that is shared between calls. So if we start tokenizing a second string before finishing the first, the two sequences of calls overwrite each other’s position. The C standard also does not guarantee that strtok() is safe to call from multiple threads.
A re-entrant function keeps no such hidden state. It can be interrupted and called again, even from another thread, without the calls disturbing each other, because all the state it needs is passed in by the caller.
strtok_r() is this re-entrant version of strtok(). It is safe to use from multiple threads, as long as each sequence of calls works on its own string with its own context.
For this, it has an extra parameter called the context. The function stores its position there, so that the next call can resume from the right place.
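NOTE: strtok_r() is specified by POSIX, so it is available on Linux, macOS and other Unix-like systems. With Microsoft’s C runtime on Windows, the equivalent function is strtok_s(), which takes the same three arguments.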
#include <string.h>
char *strtok_r(char *str, const char *delim, char **context);
The context parameter is a pointer to a char* variable, which strtok_r() uses to save its state between calls.
Usually, we just declare a char* variable ourselves and pass its address.
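If the same code has to build on both kinds of systems, one possible approach is a small macro alias. This is only a sketch under one assumption: that Microsoft’s compiler (which defines _MSC_VER) provides strtok_s() with the same three parameters as strtok_r(). The helper count_tokens() is a hypothetical name used purely for illustration.

```c
/* Ask for the POSIX declarations (including strtok_r) on strict compilers.
 * This must appear before the first #include to take effect. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Assumption: with Microsoft's compiler, strtok_s() has the same three
 * parameters as POSIX strtok_r(), so a simple alias is enough. */
#if defined(_MSC_VER)
#define strtok_r strtok_s
#endif

/* Hypothetical helper: counts the tokens in `text`, splitting it in place
 * on any character found in `delims`. */
size_t count_tokens(char *text, const char *delims) {
    assert(text != NULL && delims != NULL);
    char *context = NULL;  /* our own state, instead of strtok()'s hidden one */
    size_t count = 0;
    for (char *token = strtok_r(text, delims, &context);
         token != NULL;
         token = strtok_r(NULL, delims, &context)) {
        count++;
    }
    return count;
}
```

For example, count_tokens() returns 3 for a buffer holding "a,b,,c" with the delimiter string ",", and 0 for an empty string.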
Let’s look at the same example as for strtok(), now using strtok_r() (or strtok_s() on Windows).
#include <stdio.h>
#include <string.h>
#define MAX_TOKENS 20
#define MAX_TOKEN_LEN 20
int main(void) {
    // Our input string. It must be a writable array, since strtok_r() modifies it
    char input_string[] = "Hello from JournalDev!";
    // Our output token list
    char token_list[MAX_TOKENS][MAX_TOKEN_LEN];
    // A pointer, which will be used as the context variable
    // Initially, we set it to NULL
    char* context = NULL;
    // We pass the address of the context variable, so that strtok_r() can
    // store its position there and refer to it in later calls
    char* token = strtok_r(input_string, " ", &context);
    int num_tokens = 0; // Index to token list. We will append to the list
    // Keep getting tokens until strtok_r() returns NULL or the list is full
    while (token != NULL && num_tokens < MAX_TOKENS) {
        // Copy to token list, truncating any token that does not fit
        snprintf(token_list[num_tokens], MAX_TOKEN_LEN, "%s", token);
        num_tokens++;
        token = strtok_r(NULL, " ", &context); // We pass the same context variable again
    }
    // Print the list of tokens
    printf("Token List:\n");
    for (int i = 0; i < num_tokens; i++) {
        printf("%s\n", token_list[i]);
    }
    return 0;
}
Output
Token List:
Hello
from
JournalDev!
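While we get the same output, this version is the safer choice: because its state lives in our own context variable, it keeps working in multi-threaded code and when several strings are tokenized at the same time.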
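The difference becomes visible as soon as one tokenization is nested inside another. Below is a minimal sketch, assuming a POSIX system where strtok_r() is available. The helper print_pairs() and the "key=value;key=value" input format are hypothetical and chosen only for this illustration.

```c
/* Ask for the POSIX declarations (including strtok_r) on strict compilers.
 * This must appear before the first #include to take effect. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits `text` in place into ';'-separated pairs,
 * then splits each pair on '='. Prints every complete pair and returns
 * how many were found. */
size_t print_pairs(char *text) {
    assert(text != NULL);
    size_t pairs = 0;
    char *outer_context = NULL;  /* position within the list of pairs */
    for (char *pair = strtok_r(text, ";", &outer_context);
         pair != NULL;
         pair = strtok_r(NULL, ";", &outer_context)) {
        char *inner_context = NULL;  /* position within the current pair */
        char *key = strtok_r(pair, "=", &inner_context);
        char *value = strtok_r(NULL, "=", &inner_context);
        if (key != NULL && value != NULL) {
            printf("%s -> %s\n", key, value);
            pairs++;
        }
    }
    return pairs;
}
```

With a buffer holding "a=1;b=2;c=3", this prints the three pairs and returns 3. Each loop has its own context variable, so the inner calls never disturb the outer ones. If we swapped in plain strtok() here, the inner call would overwrite the position remembered for the outer loop, and the outer loop would lose its place.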
Conclusion
In this article, we learned how to use the strtok() and strtok_r() functions in C to tokenize strings easily.
References
- Linux manual page on strtok() and strtok_r() functions in C