C Programming/string.h/Function reference
memset
Synopsis
#include <string.h>
void *memset(void *s, int c, size_t n);
The memset() function copies c, converted to unsigned char, into each of the first n bytes of the object pointed to by s.
Return value
The function returns the pointer s. There is no defined value to return when an error occurs.
strcat
The C standard library provides a function called strcat that appends one null-terminated string to another. Because strings are not a first-class data type in C and are instead stored as null-terminated arrays of characters, strcat works on two pointers to such arrays: it copies the source string, including its terminator, onto the end of the destination string. The destination array must be large enough to hold the combined result. The name strcat is an abbreviation of "string concatenate". strcat is declared in the string.h header file.
For example:
char str1[14] = "Hello,"; /* The array has enough space for 'Hello,' plus " world!" plus a null terminator */
strcat(str1, " world!");
puts(str1); /* prints "Hello, world!" to stdout followed by a newline */
Here is a possible implementation of strcat:
char *
strcat(char *dest, const char *src)
{
    size_t i, j;
    for (i = 0; dest[i] != '\0'; i++)
        ;
    for (j = 0; src[j] != '\0'; j++)
        dest[i+j] = src[j];
    dest[i+j] = '\0';
    return dest;
}
It can also be defined in terms of other string library functions:
char *
strcat(char *dest, const char *src)
{
    strcpy(dest + strlen(dest), src);
    return dest;
}
Bounds errors
strcat can be dangerous because if the string to be appended is too long to fit in the destination buffer, it will overwrite adjacent memory, invoking undefined behavior. Often the program simply crashes with a segmentation fault when this occurs, but a skilled attacker can use such a buffer overflow to break into a system (see computer security).
Bounds checking variants
To prevent buffer overflows, several alternatives for strcat have been used. All of them take an extra argument which encodes the length of the destination buffer and will not write past that buffer end. All of them can still result in buffer overflows if an incorrect length is provided.
strncat
char* strncat(char* dst, const char* src, size_t n);
The most common bounded variant, strncat, appends at most the specified number of bytes from src, followed by a terminating null byte. This allows each concatenated string to use no more than its "share" of a buffer and was perhaps intended for building tables. It is poorly suited to the more common need of keeping the prefix of the concatenated string that fits in the buffer. For that, the proper value to pass for the count is bufferSize-strlen(buffer)-1. Common mistakes are to pass bufferSize, bufferSize-1, or bufferSize-strlen(buffer), all of which can still produce a buffer overflow.
strlcat
size_t strlcat(char* dst, const char* src, size_t size);
The strlcat function, created by OpenBSD developers Todd C. Miller and Theo de Raadt, is often regarded as a safer and more useful version of strncat. It takes the full size of the destination buffer as an argument and returns the length of the string it tried to create (the initial length of dst plus the length of src), so the caller can detect truncation and reallocate the buffer if possible. It has been ported to a number of operating systems, but was notably rejected by the glibc maintainers, who suggested that C programmers need to keep track of string length and that "using this function only leads to other errors."[1]
strcat_s
errno_t strcat_s(char* dst, rsize_t size, const char* src);
The strcat_s function, proposed for standardisation in ISO/IEC TR 24731,[2][3] is supported by the Microsoft C Runtime Library[4] and some other C libraries. It returns non-zero if the source string does not fit, and sets the buffer to the empty string (a disastrous result if the original string is not stored elsewhere or if the caller ignores the return value). It is also explicitly unsupported by some libraries, including the GNU C library.[5] Warning messages produced by Microsoft's compilers suggesting programmers change strcat and strncat to this function have been speculated by some to be a Microsoft attempt to lock developers to its platform.[6][7]
External links
- strcat(3) man page via OpenBSD
- C++ reference for std::strcat
strchr
Function name strchr() for C and C++
Syntax
#include <string.h>
char *strchr(const char *s, int c);
Description
The strchr() function locates the first occurrence of c, cast to char, in the string pointed to by s. The terminating null character is considered a part of the string.
Parameters
s Points to the string to be searched.
c Is the character to search for in string s.
Return Values
The strchr() function returns a pointer to the first occurrence of character c located within s. If character c does not occur in the string, strchr() returns a null pointer.
strcmp
In POSIX and in the programming language C, strcmp is a function in the C standard library (declared in string.h) that compares two C strings.
The prototype according to ISO/IEC 9899:1999, 7.21.4.2:
int strcmp(const char *s1, const char *s2);
strcmp returns 0 when the strings are equal, a negative integer when s1 is less than s2, or a positive integer if s1 is greater than s2, according to the lexicographical order.
A variant of strcmp exists called strncmp that compares at most a given number of characters of the two strings.
Another variant, strcasecmp, conforming to POSIX.1-2001, works like strcmp, but is case-insensitive. Some systems instead provide this functionality with functions named stricmp or strcmpi. To compare a limited number of characters case-insensitively, various systems may provide strncasecmp, strnicmp or strncmpi.
Example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc, char **argv)
{
    int v;
    if (argc < 3)
    {
        fprintf (stderr, "Compares two strings\nUsage: %s string1 string2\n", argv[0]);
        return EXIT_FAILURE;
    }
    v = strcmp (argv[1], argv[2]);
    if (v < 0)
        printf ("'%s' is less than '%s'.\n", argv[1], argv[2]);
    else if (v == 0)
        printf ("'%s' equals '%s'.\n", argv[1], argv[2]);
    else
        printf ("'%s' is greater than '%s'.\n", argv[1], argv[2]);
    return EXIT_SUCCESS;
}
The above code is a working sample that prints whether the first argument is less than, equal to or greater than the second.
A possible implementation is (P.J. Plauger, The Standard C Library, 1992):
int strcmp (const char * s1, const char * s2)
{
    for(; *s1 == *s2; ++s1, ++s2)
        if(*s1 == 0)
            return 0;
    return *(unsigned char *)s1 < *(unsigned char *)s2 ? -1 : 1;
}
However, most real-world implementations use various optimizations to reduce the execution time of the function. Note that strcmp is not required to return only -1, 0 and +1; implementations often return other negative or positive values, for example by replacing the branch introduced by the ?: operator with a subtraction:
return *(const unsigned char *)s1 - *(const unsigned char *)s2;
strcpy
The C programming language offers a library function called strcpy, declared in the string.h header file, that copies a null-terminated string from one location to another. Since strings in C are not first-class data types and are implemented instead as contiguous blocks of bytes in memory, strcpy works on two pointers to blocks of allocated memory, which must not overlap.
The prototype of the function is:[8]
char *strcpy(char *destination, const char *source);
The argument order mimics that of an assignment: destination "=" source. The return value is destination.
Usage and implementation
For example:
char *str1 = "abcdefghijklmnop";
char *str2 = malloc(100); /* must be large enough to hold the string! */
strcpy(str2, str1); /* str2 is now "abcdefghijklmnop" */
str2[0] = 'A'; /* str2 is now "Abcdefghijklmnop" */
/* str1 is still "abcdefghijklmnop" */
In the second line memory is allocated to hold the copy of the string, then the string is copied from one memory block into the other, then the first letter of that copy is modified.
Although the simple assignment str2 = str1 might appear to do the same thing, it only copies the memory address of str1 into str2 but not the actual string. Both str1 and str2 would refer to the same memory block, and the allocated block that used to be pointed to by str2 would be lost. The assignment to str2[0] would either also modify str1, or it would cause an access violation (as modern compilers often place the string constants in read-only memory).
The strcpy function performs a copy by iterating over the individual characters of the string and copying them one by one. An explicit implementation of strcpy is:
char *strcpy(char *dest, const char *src)
{
    size_t i;
    for (i = 0; src[i] != '\0'; ++i)
        dest[i] = src[i];
    /* Ensure the trailing null byte is copied */
    dest[i] = '\0';
    return dest;
}
A common compact implementation is:
char *strcpy(char *dest, const char *src)
{
    char *save = dest;
    while(*dest++ = *src++);
    return save;
}
Modern versions provided by C libraries often copy far more than one byte at a time, relying on bit math to detect if the larger word has a null byte before writing it. Often a call compiles into an inline machine instruction specifically designed to do strcpy.
Unicode
strcpy will work for all common byte encodings of Unicode strings, including UTF-8. There is no need to actually know the encoding as long as the null byte is never used by it.
If Unicode is encoded in units larger than a byte, such as UTF-16, then a different function is needed, as null bytes will occur in parts of the larger code units. C99 defines the function wcscpy(), which will copy wchar_t-sized objects and stop at the first one with a zero value. This is not as useful as it appears, as different computer platforms disagree on how large a wchar_t is (some use 16 bits and some 32 bits).
Buffer overflows
strcpy can be dangerous because if the string to be copied is too long to fit in the destination buffer, it will overwrite adjacent memory, invoking undefined behavior. Often the program simply crashes with a segmentation fault when this occurs, but a skilled attacker can use a buffer overflow to break into a system. To prevent buffer overflows, several alternatives for strcpy have been used. All of them take an extra argument which is the length of the destination buffer and will not write past that buffer end. All of them can still result in buffer overflows if an incorrect length is provided.
strncpy
char* strncpy(char* dst, const char* src, size_t size);
strncpy writes exactly the given number of bytes, either only copying the start of the string if it is too long, or adding zeros to the end of the copy to fill the buffer. It was introduced into the C library to deal with fixed-length name fields in structures such as directory entries. Despite its name it is not a bounded version of strcpy; it does not guarantee that the result is a null-terminated string. The name of the function is misleading because strncat and snprintf are respectively bounded versions of strcat and sprintf.
The assumption that the result is a null-terminated string leads to two problems. If the source string is too long, the result is not null-terminated, making data after the end of the buffer appear to be part of the string. And if the source string is much shorter than the buffer, considerable time will be wasted filling the rest of the buffer with null bytes.
An alternative from the standard C library that will always append one null byte is to use strncat with an initially empty string as the destination.
strlcpy
size_t strlcpy(char* dst, const char* src, size_t size);
The strlcpy function, created by OpenBSD developers Todd C. Miller and Theo de Raadt, is often regarded as a safer version of strncpy. It always adds a single null byte (provided size is not zero) and returns the length of the source string, so the caller can detect truncation and reallocate the buffer if possible. It has been ported to a number of operating systems, but was notably rejected by Ulrich Drepper, then the glibc maintainer, who suggested that C programmers need to keep track of string length and that "using this function only leads to other errors."[1]
strcpy_s
errno_t strcpy_s(char* dst, rsize_t size, const char* src);
The strcpy_s function, proposed for standardisation in ISO/IEC TR 24731,[9][10] is supported by the Microsoft C Runtime Library[11] and some other C libraries. It returns non-zero if the source string does not fit, and sets the buffer to the empty string (not the prefix!). It is also explicitly unsupported by some libraries, including the GNU C library.[12] Warning messages produced by Microsoft's compilers suggesting programmers change strcpy and strncpy to this function have been speculated by some to be a Microsoft attempt to lock developers to its platform.[13][14]
External links
- C++ reference for std::strcpy
- : Copy strings – OpenBSD Library Functions Manual
strcspn
strcspn is a function from the C standard library (declared in string.h).
It searches a string for any of a given set of characters.
The strcspn() function calculates the length of the initial segment of string 1 that does not contain any character from string 2.
Return Value
This function returns the index of the first character in string 1 that matches any character of string 2. If there is no such character, it returns the length of string 1.
Syntax
#include <string.h>
size_t strcspn( const char *str1, const char *str2 );
Example
#include <stdio.h>
#include <string.h>
int main(void)
{
    char s[20] = "wikireader007", t[11] = "0123456789";
    /* strcspn returns size_t, so cast it to match the %lu conversion */
    printf("The first decimal digit of s is at position: %lu.\n", (unsigned long)strcspn(s, t));
    return 0;
}
Output:
The first decimal digit of s is at position: 10.
strerror
The string-error function, strerror, is a C/C++ function which translates an error code, usually taken from errno, into a human-readable error message.
History
The strerror function is defined in the ISO C standard and in IEEE Std 1003.1, also known as POSIX.1.
Reentrancy
The strerror function is not required to be reentrant or thread-safe; the returned string may be overwritten by a later call. For a reentrant version of the function, see strerror_r.
Usage
Inclusion
#include <string.h>
Declaration
char* strerror(int errnum);
Semantics
The function generates and reports a C-style string, containing an error message derived from the error code passed in with errnum.
References
- strerror by OpenGroup
strlen
In the C standard library, strlen is a string function that determines the length of a C character string.
Example usage
#include <stdio.h>
#include <string.h>
int main(void)
{
    const char *string = "Hello World";
    printf("%lu\n", (unsigned long)strlen(string));
    return 0;
}
This program will print the value 11, which is the length of the string "Hello World". Character strings are stored in an array of a data type called char. The end of a string is found by searching for the first null character in the array.
Note that this length does not include the trailing null byte that terminates every C string. Thus, to copy a C string, you need to allocate strlen() + 1 bytes.
Implementation
FreeBSD 6.2 implements strlen like so:[15]
size_t strlen(const char * str)
{
    const char *s;
    for (s = str; *s; ++s) {}
    return(s - str);
}
It is possible to write faster versions in C that examine a full machine word at a time rather than a single byte. Hacker's Delight gives an algorithm that uses bitwise operations to detect whether any byte in a word is nul ('\0'). The current FreeBSD implementation does this.[16]
Modern C compilers usually provide fast inline versions of strlen written in assembly, either using the bitwise operation technique or a special instruction provided by certain CISC processors. In addition, strlen of a quoted string constant is often optimized into a constant integer.
External links
- Linux Library Functions Manual : calculate the length of a string –
- C++ reference for std::strlen
strrchr
strrchr is a function declared in string.h.[17] It locates the last occurrence of a character in a string. It returns a pointer to the last occurrence of character in the C string str. The terminating null character is considered part of the C string, so searching for '\0' retrieves a pointer to the end of the string.
Syntax
In C, this function is declared as:
char *strrchr ( const char *str, int character );
str is a C string. character is the character to be located. It is passed as its int promotion, but it is internally converted back to char.
Return value
A pointer to the last occurrence of character in str. If the value is not found, the function returns a null pointer.
Example
#include <stdio.h>
#include <string.h>
int main(void)
{
    const char *str = "This is a sample string";
    const char *pch = strrchr(str, 's');
    /* the pointer difference has type ptrdiff_t, so cast it for %d */
    printf("Last occurrence of 's' found at %d\n", (int)(pch - str + 1));
    return 0;
}
Output: Last occurrence of 's' found at 18
strspn
The strspn() function returns the length of the maximum initial segment of the string pointed to by s1 that consists entirely of characters from the string pointed to by s2, while strcspn() returns the length of the initial segment that contains no characters from the reject list.
Syntax
size_t strspn(const char *s, const char *accept);
size_t strcspn(const char *s, const char *reject);
Parameters
s1 (s in the prototypes above) - It points to the null-terminated string to be searched.
s2 (accept or reject in the prototypes above) - It points to the null-terminated set of characters.
strstr
strstr is a C standard library string function declared in string.h. strstr() has the function signature char * strstr(const char *haystack, const char *needle); and returns a pointer to the first occurrence of needle in haystack, or NULL if needle is not present.[18]
The strcasestr() is a function much like strstr() except that it ignores the case of both needle and haystack. strcasestr() is a non-standard function while strstr() conforms to C89 and C99.[18]
Example
#include <stdio.h>
#include <string.h>
int main(void)
{
    /* Define a pointer of type char, a string and the substring to be found */
    char *cptr;
    char str[] = "Wikipedia, be bold";
    char substr[] = "edia, b";
    /* Find memory address where substr "edia, b" is found in str */
    cptr = strstr(str, substr);
    /* Print out the character at this memory address, i.e. 'e' */
    printf("%c\n", *cptr);
    /* Print out "edia, be bold" */
    printf("%s\n", cptr);
    return 0;
}
cptr is now a pointer to the sixth letter (e) in "Wikipedia".
strtok
strtok is a C standard library function that splits a string into tokens. Its syntax is as follows:
#include <string.h>
char *strtok( char *str1, const char *str2 );
Description: The strtok() function returns a pointer to the next "token" in str1, where str2 contains the delimiters that separate tokens. strtok() returns NULL if no token is found. To split a string into tokens, the first call to strtok() should have str1 point to the string to be tokenized. All later calls should pass NULL as str1. strtok() modifies the string by overwriting delimiters with null bytes, so str1 must not point to a string literal.
strxfrm
strxfrm is a C Standard Library string function declared in string.h. It transforms a string according to the current locale setting.
The prototype of this function is:
size_t strxfrm(char *str1 , const char *str2 , size_t num);
Parameters
str1
is the array that receives the transformed string; at most num characters, including the terminating null character, are written. If num is zero, nothing is written and str1 may be a null pointer.
str2
is the string to be transformed.
num
is the maximum number of characters to be placed in str1.
Description
The strxfrm() function transforms str2 according to the current locale setting, using the LC_COLLATE category, which is defined in locale.h. At most num characters of the transformed string, including the terminating null character, are placed in str1. The transformation is such that the result of strcmp on two transformed strings has the same sign as the result of strcoll on the two original strings.
Return Value
The strxfrm() function returns the length of the transformed string, excluding the terminating null character.
Example usage
#include <stdio.h>
#include <string.h>
int main(void)
{
    char str2[] = "Hello World";
    char str1[32];   /* large enough for the transformed string in the "C" locale */
    size_t len = strxfrm(str1, str2, sizeof str1);
    printf("The length of the transformed string = %lu\n", (unsigned long)len);
    printf("The content of str1 = %s\n", str1);
    printf("The content of str2 = %s\n", str2);
    return 0;
}
Output (in the default "C" locale, where typical implementations leave the string unchanged)
The length of the transformed string = 11
The content of str1 = Hello World
The content of str2 = Hello World
External links
- Linux Library Functions Manual –
- C++ reference for std::strxfrm
References
1. libc-alpha mailing list, selected messages from 8 Aug 2000 thread: 53, 60, 61
2. ISO/IEC. ISO/IEC WDTR 24731 Specification for Secure C Library Functions. International Organization for Standardization. Retrieved 2008-04-23.
3. Plakosh, Daniel. "strcpy_s() and strcat_s()". Pearson Education, Inc. Retrieved 2006-08-12.
4. Microsoft. "Security Enhancements in the CRT". MSDN. Retrieved 2008-09-16.
5. "Re: Implementing "Extensions to the C Library" (ISO/IEC WG14 N1172)".
6. Danny Kalev. "They're at it again". InformIT.
7. "Security Enhanced CRT, Safer Than Standard Library?".
8. ISO/IEC 9899:1999 specification, p. 326, § 7.21.2.3
9. ISO/IEC. ISO/IEC WDTR 24731 Specification for Secure C Library Functions. International Organization for Standardization. Retrieved 2008-04-23.
10. Plakosh, Daniel. "strcpy_s() and strcat_s()". Pearson Education, Inc. Retrieved 2006-08-12.
11. Microsoft. "Security Enhancements in the CRT". MSDN. Retrieved 2008-09-16.
12. "Re: Implementing "Extensions to the C Library" (ISO/IEC WG14 N1172)".
13. Danny Kalev. "They're at it again". InformIT.
14. "Security Enhanced CRT, Safer Than Standard Library?".
15. "strlen.c Revision 1.4". FreeBSD. 2002-03-21. Retrieved 2009-03-04.
16. "Contents of /stable/10/lib/libc/string/strlen.c". FreeBSD. 2013-10-10.
17. ISO/IEC 9899:1999 specification (PDF). p. 343, § 7.12.4.3.
18. strcasestr(3)