In this assignment, you'll create your own library of string functions.
You'll have the opportunity to practice manipulating strings and
managing memory. Additionally, you'll learn the role of header and
library files.
You may not call functions in string.h but you can use other code in the
Standard C Library.
Functions to Include in the Library
Implement each of the following functions. Be sure that any string that
you create or modify is in fact a string, i.e., an array of char terminated
with the null character, '\0'.
Additionally, you should write a driver which tests each of these
functions on real data.
• [5 points] void remove_line_end(char *s)
removes any trailing newline characters from s if they exist
• [5 points] int index_of(char *h, char *n)
returns the index of the first occurrence of n in the string h or -1 if it
isn't found.
• [5 points] char *address_of(char *h, char *n)
returns a pointer to the first occurrence of n in the string h or NULL
if it isn't found
• [4 points] int is_empty(char *s)
returns 1 if s is NULL, consists of only the null character ('\0'), or
contains only whitespace. Returns 0 otherwise.
• [5 points] char *str_zip(char *s1, char *s2)
Returns a new string consisting of all of the characters of s1 and
s2 interleaved with each other. For example, if s1 is "Spongebob"
and s2 is "Patrick", the function returns the string
"SPpaotnrgiecbkob"
• [5 points] char *acronymer(char *s)
returns a new string which is an acronym of the words in s. For
example, if s is the string "Royal Australian Air Force", the
function returns the string "RAAF".
• [5 points] int strcmp_ign_case(char *s1, char *s2)
Compares s1 and s2 ignoring case. Returns a positive number if
s1 would appear after s2 in the dictionary, a negative number if it
would appear before s2, or 0 if the two are equal.
• [3 points] void take(char *s, int n)
Modifies s so that it is truncated to n characters. If n is ≥ the
length of s, the original string is unmodified. For example, if we call
take("Brubeck", 5), when the function finishes, the original string
becomes "Brube".
• [3 points] void take_last(char *s, int n)
Modifies s so that it consists of only its last n characters. If n is ≥
the length of s, the original string is unmodified. For example, if we
call take_last("Brubeck", 5), when the function finishes, the original
string becomes "ubeck".
• [5 points] char *dedup(char *s)
returns a new string based on s, but without any duplicate
characters. For example, if s is the string "There's always
money in the banana stand.", the function returns the string
"Ther's alwymonitbd.". It is up to the caller to free the
memory allocated by the function.
• [5 points] char *pad(char *s, int d)
returns a new string consisting of all of the characters of s, but padded
with spaces at the end so that the total length of the returned
string is an exact multiple of d. If the length of s is already an exact
multiple of d, the function returns a copy of s. The function
returns NULL on failure or if s is NULL. Otherwise, it returns the
new string. It is up to the caller to free any memory allocated by
the function.
• [5 points] int begins_with_ignore_case(char *s, char *pre)
returns 1 if pre is a prefix of s ignoring case or 0 otherwise.
• [5 points] int ends_with_ignore_case(char *s, char *suff)
returns 1 if suff is a suffix of s ignoring case or 0 otherwise.
• [5 points] char *repeat(char *s, int x, char sep)
Returns a new string consisting of the characters in s repeated x
times, with the character sep in between. For example, if s is the
string "all right", x is 3, and sep is ',', the function returns the
new string "all right,all right,all right". If s is NULL,
the function returns NULL. It is up to the caller to free any memory
allocated by the function.
• [5 points] char *intersect(char *s1, char *s2)
returns a new string consisting of the intersection of the
characters in s1 and s2. If s1 and s2 have no characters in
common, the function returns NULL. It is up to the caller to free
any memory allocated by intersect.
• [5 points] char *str_connect(char **strs, int n, char c)
Returns a string consisting of the first n strings in strs with the
character c used as a separator. For example, if strs contains the
strings {"Washington", "Adams", "Jefferson"} and c
is '+', the function returns the
string "Washington+Adams+Jefferson"
• [5 points] char **str_chop(char *s, char c)
Returns an array of 3 strings consisting of the characters in s: The
first string consists of the characters of s before c, the second
string consists solely of c itself, and the third string consists of the
letters that follow c in s. (Remember that all of these must be C
strings.) If c is not found in s, return an array of 3 NULL pointers.
For example, if s is "Kanye+Tay Tay" and c is '+', it
returns {"Kanye", "+", "Tay Tay"}
• [5 points] char **str_chop_all(char *s, char c)
Returns an array of strings consisting of the characters in s split
into tokens based on the delimiter c, followed by a NULL pointer.
For example, if s is "I am ready for a nice vacation" and c is ' ', it
returns {"I", "am", "ready", "for", "a", "nice",
"vacation", NULL}
Pointer vs Array Notation
Though it's not a formal requirement, it is suggested that you try to do
some of these using pointer notation instead of array notation.
For example, we could write a string length function as:
int strlen(char s[])
{
    int i=0;
    while (s[i]!='\0')
        i++;
    return i;
}
We could also write:
int strlen(char *s)
{
    char *t=s;
    while (*t!='\0')
        t++;
    return t-s;
}
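Since string.h is off limits, you will probably end up writing a few small helpers of your own, and the same pointer idiom carries over to them. Here is a minimal sketch of a copy helper in pointer notation. The name str_copy is made up for this example and is not one of the required functions, and the sketch assumes the destination is large enough.

```c
/* Hypothetical helper (not part of the assignment): copies src into dst,
 * including the terminating '\0', and returns dst.
 * Assumes dst has room for every character of src plus the terminator. */
char *str_copy(char *dst, const char *src)
{
    char *d = dst;

    /* Copy one char per iteration; stop after the '\0' has been copied. */
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}
```

The array-notation version would use an index and dst[i] = src[i] instead; both do the same work.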
Architecture
Remember that we keep our function declarations in a header file, a .h
file. Each of your functions should be in a separate .c file (e.g., you'll
have remove_line_end in a file called remove_line_end.c, index_of in a
file called index_of.c, and so on).
So you'll have:
• A collection of .c files:
o one for each function in the library
o a test program
• A single .h file, which includes the declarations for each of the
functions in the library. This is #included by each of the .c files.
This is the same organization we used for our little practice math library
from class.
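As a rough illustration, the single header might look something like the sketch below. The file name str2107.h and the include-guard macro STR2107_H are assumptions made for this example; only the declarations whose full signatures appear in the function list are shown.

```c
/* str2107.h (hypothetical file name): declarations for the string library. */
#ifndef STR2107_H   /* include guard: keeps the header from being processed twice */
#define STR2107_H

void remove_line_end(char *s);
int index_of(char *h, char *n);
char *address_of(char *h, char *n);
char *acronymer(char *s);
int strcmp_ign_case(char *s1, char *s2);
/* ... one declaration for each of the remaining library functions ... */

#endif /* STR2107_H */
```

Each function's .c file, and the test program, would then start with #include "str2107.h" (or whatever name you choose for the header).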
Generating the Library (5 points)
To create the library file, we use the ar command. The syntax is:
ar rcs NameOfTheLibraryFileToCreate listOfFilesToIncludeInTheLibrary
Don't forget that you're including the binary files, i.e., the .o files, not
your C source files. (Compiling a .c file with gcc -c produces the matching
.o file.)
So suppose you have all of your function .o files in a single directory,
and you'd like to call your library file libstr2107.a, you'd type:
ar rcs libstr2107.a *.o
To double-check, you can try:
ar t libstr2107.a
If you see a list of your .o files, you've done it right.
Using the library with your driver (10 points)
Write a very simple, basic program to test your functions. Suppose that
it's in a file called strtester.c. To compile it, using your new string library,
libstr2107.a, type:
gcc -o strtester strtester.c -LdirectoryWhereYouPutTheLibrary -lstr2107
so, if libstr2107.a is in your current directory, you'd type:
gcc -o strtester strtester.c -L. -lstr2107
Note that at the end of the line, it's -l (lower case l, not the number 1),
and it's just str2107, not libstr2107.a.
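A driver can stay very simple and still report clearly what passed and what failed. The sketch below shows two small checking helpers you could put in strtester.c. The names check_int and check_str are invented for this example and are not part of the assignment; check_str is written without string.h, in keeping with the rules above.

```c
#include <stdio.h>

/* Hypothetical driver helper: compares an int result with the expected
 * value, prints PASS or FAIL, and returns 1 on a match, 0 otherwise. */
int check_int(const char *label, int got, int expected)
{
    int ok = (got == expected);

    printf("%s: %s (got %d, expected %d)\n",
           ok ? "PASS" : "FAIL", label, got, expected);
    return ok;
}

/* Hypothetical driver helper: same idea for strings.
 * A NULL pointer matches only another NULL pointer. */
int check_str(const char *label, const char *got, const char *expected)
{
    int ok;

    if (got == NULL || expected == NULL) {
        ok = (got == expected);
    } else {
        /* Walk both strings while they agree and got has not ended. */
        while (*got != '\0' && *got == *expected) {
            got++;
            expected++;
        }
        ok = (*got == *expected);   /* equal only if both ended together */
    }
    printf("%s: %s\n", ok ? "PASS" : "FAIL", label);
    return ok;
}
```

In main you might then write calls such as check_int("index_of", index_of("hello", "ll"), 2); one or more per library function.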
Deliverables
Please upload a single zip/rar/tar file containing all of your .c files (string
functions and test file), your .h file and your string library file (your .a file).
In order to help the TA keep track of everyone's files, please include
your name in the name of the compressed file.
Some Tips, Reminders, etc.
Include the .h file in each function .c file.
You may not use any function in string.h.
You may not change the name or return type of any function, or there
will be errors when the TA tests your functions with his own test file.
The name of the library file has the prefix “lib”, and the actual
name is what comes after that. In the previous example, the file is
libstr2107.a, so when we need to use that library, it's -lstr2107, not
-llibstr2107.
Don't forget to free these pointers in your main function, or the memory
malloced in your functions will stay allocated (leaked) until your program
exits.
gdb
We spent time with GDB for a reason. Use it.
stack memory vs. heap memory
All of this about "The caller is responsible for freeing the
memory ..." should tell you that inside the function you're going to be
allocating the memory for what's returned. Remember to
use malloc (or calloc) and don't return a pointer to local stack
memory. For example, something like this:
char *func()
{
    char str[50];
    ...
    return str;
}
might compile, but it's wrong. str is allocated on func's stack, and the
memory used by str will likely be given to some other function as the
program continues to run. Instead, you'd do something like:
char *func()
{
    char *str;
    ...
    /* allocate 50 bytes and check to see if they were
     * successfully allocated */
    if ((str=malloc(50))==NULL)
        return NULL;
    ...
    return str;
}
As we've discussed in class, this is something that you didn't have to
worry about in Java, because Java arrays are allocated on the heap.
Remember that the equivalent Java would be:
char[] func()
{
    char str[] = new char[50];
    ...
    return str;
}
and as we also said in class, Java's new operator serves the same
purpose as C's malloc().
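To make the "caller frees" convention concrete, here is one possible sketch of repeat from the function list. It is meant as an illustration of allocating the result on the heap, not as the reference solution. Returning NULL when x is less than 1 is an assumption; the assignment only specifies the NULL case for s.

```c
#include <stdlib.h>

/* Sketch of repeat(): returns a new heap-allocated string holding x copies
 * of s separated by sep, or NULL on failure. The caller must free it.
 * Assumption: x < 1 is treated as a failure and returns NULL. */
char *repeat(char *s, int x, char sep)
{
    size_t len = 0;
    size_t total;
    char *out;
    char *p;
    int i;

    if (s == NULL || x < 1)
        return NULL;

    while (s[len] != '\0')          /* length of s, without string.h */
        len++;

    /* x copies of s, x-1 separators, and 1 byte for the '\0' */
    total = len * (size_t)x + (size_t)(x - 1) + 1;
    if ((out = malloc(total)) == NULL)
        return NULL;

    p = out;
    for (i = 0; i < x; i++) {
        size_t k;

        if (i > 0)
            *p++ = sep;             /* separator goes between copies only */
        for (k = 0; k < len; k++)
            *p++ = s[k];
    }
    *p = '\0';                      /* make the result a proper C string */
    return out;
}
```

A caller would store the returned pointer, use it, and then pass it to free exactly once, for example: char *r = repeat("all right", 3, ','); ... free(r);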
string literals
Here's another thing that you didn't need to worry about in Java that
might cause some headaches in C. Remember that these two
declarations are not exactly the same:
char *str01 = "What time does class end?";
char str02[] = "What time does class end?";
They both look the same when you print them, but if you try to
modify str01, you'll get in trouble. The memory that str01 points to is
read-only. str02 isn't, but it's only as much space as is required to hold
the letters 'W', 'h', 'a', ..., and the null character ('\0').
None of this is likely to be a big deal in any of the functions you write,
but it might come up when you're trying to test a function. For example,
something like:
char *str01 = " Where's the remote control? ";
...
strip(str01);
will most likely give you a segmentation fault, because str01 points to a
read-only literal. This doesn't necessarily mean that there's a problem in
the strip function.
For more information, please take a look at K&R section 5.5.