The Complete Guide to Extracting Substrings in C | Standard Functions, Custom Functions, and Multibyte Support

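1. Introduction
2. What Is a String in C? Basic Concepts and the Importance of the Terminating Character
3. Extracting Substrings in C [Standard Library]
4. Extracting Substrings in C [Custom Functions]
5. Extracting Substrings by Character Encoding
6. Splitting Strings in C
7. Applied Examples: Extracting the Text Before and After a Specific Character
8. Summary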

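1. Introduction

String manipulation is one of the essential skills when learning to program in C. Substring extraction in particular comes up constantly in data processing and format conversion.

This article explains in detail how to extract substrings in C: using standard library functions, writing your own functions, handling multibyte characters (Japanese), and splitting strings. It also covers applied examples and error handling.

What you will learn

After reading this article you will be able to handle the following.

The basic concept of C strings and the role of the terminating character
Substring extraction with standard library functions such as strncpy and strchr
Substring extraction with custom functions
Extraction that accounts for multibyte characters (Japanese)
Splitting strings with strtok
Getting the text before and after a specific character

The explanations are paired with code examples so that beginners can follow along.

Why does substring extraction matter in C?

C treats a string as an array (of type char), so you cannot take a substring as easily as in higher-level languages such as Python or JavaScript. Choosing an appropriate method therefore matters in situations like the following.

1. Processing input data

For example, when parsing log data or CSV files, you need to extract specific fields.

2. Searching for a specific keyword

Finding a keyword within a string and retrieving the text around it is indispensable for search features and data extraction.

3. Improving program safety

Using functions such as strncpy properly helps prevent buffer overflows (writing more data than the buffer can hold), which is important for avoiding security risks.

Structure of this article

This article proceeds in the following order.

What is a string in C? Basic concepts and the importance of the terminating character
Extracting substrings in C [Standard Library]
Extracting substrings in C [Custom Functions]
Extracting substrings by character encoding
Splitting strings in C
Applied examples: extracting the text before and after a specific character
Summary

Let us begin with what a string is in C, the basic concepts, and the importance of the terminating character.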

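2. What Is a String in C? Basic Concepts and the Importance of the Terminating Character

2.1 Basic concepts of C strings

A string is an array of char

In C, a string is handled as an array of characters (an array of type char). The following code is a basic example of defining and displaying a string.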

#include <stdio.h>

int main() {
    char str[] = "Hello, World!"; // Define a string literal as an array
    printf("%s\n", str); // Output the string
    return 0;
}

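In this code, the string "Hello, World!" is stored as an array of type char and printed with printf("%s\n", str);.

Internal structure of a string

The string "Hello" is stored in memory as follows.

Index	0	1	2	3	4	5
Character	H	e	l	l	o	\0

In C, a special character that marks the end of the string (the null character '\0') is appended automatically to a string literal, so the array needs "the number of actual characters + 1" elements.

2.2 The importance of the terminating character (the null character '\0')

What is the null character?

The null character ('\0') is a special character that marks the end of a string. To handle C strings correctly, you need to be aware of it.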

#include <stdio.h>

int main() {
    char str[6] = {'H', 'e', 'l', 'l', 'o', '\0'}; // Explicitly specify the null terminator
    printf("%s\n", str);                           // Display correctly
    return 0;
}

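If the '\0' in the code above were missing, the end of "Hello" could not be recognized, which can lead to unexpected behavior.

Problems when the null character is missing

As shown below, forgetting the terminating character can cause reads beyond the end of the array and erratic behavior.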

#include <stdio.h>

int main() {
    char str[5] = {'H', 'e', 'l', 'l', 'o'}; // Does not include the null terminator
    printf("%s\n", str);                     // May cause unexpected behavior
    return 0;
}

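Cause of the error

printf("%s\n", str); keeps printing characters until it finds a null character '\0'.
If there is no '\0', it may print whatever other data happens to follow in memory.

2.3 Correct ways to define a string

Method 1: Use a string literal

char str[] = "Hello";

With this method the C compiler appends the null character '\0' automatically, so no special handling is needed.

Method 2: Define the array explicitly

If you define the array by hand and include '\0' yourself, write it as follows.

char str[6] = {'H', 'e', 'l', 'l', 'o', '\0'};

It is important to specify the size as the number of characters plus one and to put '\0' at the end.
If you forget to include '\0', unexpected behavior will occur.

2.4 How to check the size of a string

To get the length of a string (the number of characters), use the strlen function.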

#include <stdio.h>
#include <string.h>

int main() {
    char str[] = "Hello";
    printf("Length of the string: %zu\n", strlen(str)); // Outputs 5 (does not include the null terminator)
    return 0;
}

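How `strlen` works

strlen returns the number of characters up to, but not including, the null character '\0'.
sizeof(str) returns the size of the whole array in bytes, including the null character.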

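2.5 Summary

A C string is represented as a char array.
The terminating character (the null character '\0') marks the end of the string and must always be present.
Use strlen to get the length of a string.
If a string is not defined properly, unexpected errors may occur.

3. Extracting Substrings in C [Standard Library]

C lets you extract substrings with the standard library. This section explains how to obtain part of a string with standard library functions such as strncpy and strchr.

3.1 Getting a substring with `strncpy`

Basic syntax of `strncpy`

strncpy copies part of a string into another buffer.

char *strncpy(char *dest, const char *src, size_t n);

dest: the destination buffer
src: the source string
n: the maximum number of characters to copy (no '\0' is appended when src has n or more characters)

Basic usage example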

#include <stdio.h>
#include <string.h>

int main() {
    char src[] = "Hello, World!";
    char dest[6];  // Buffer to store the substring

    strncpy(dest, src, 5); // Copy the first 5 characters "Hello"
    dest[5] = '\0';        // Manually add the null terminator

    printf("Substring: %s\n", dest);  // Output "Hello"

    return 0;
}

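Notes on `strncpy`

The null character '\0' must be added manually: strncpy copies at most n characters but does not append '\0' when src has n or more characters, so you must add it explicitly.
Watch out for buffer overflows: if n is larger than the size of dest, strncpy may write outside the buffer.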

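3.2 Safe string copying with `strncpy_s`

Basic syntax of `strncpy_s`

strncpy_s is a version of strncpy with added safety checks that helps prevent buffer overflows.

errno_t strncpy_s(char *dest, rsize_t destsz, const char *src, rsize_t n);

dest: the destination buffer
destsz: the size of dest
src: the source string
n: the maximum number of characters to copy

Usage example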

#include <stdio.h>
#include <string.h>

int main() {
    char src[] = "Hello, World!";
    char dest[6];

    if (strncpy_s(dest, sizeof(dest), src, 5) == 0) {
        dest[5] = '\0';  // Add null terminator just in case
        printf("Substring: %s\n", dest);
    } else {
        printf("Copy error\n");
    }

    return 0;
}

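Advantages of `strncpy_s`

Because the buffer size (destsz) is specified, the copy can be performed safely.
If the characters to copy plus the terminator do not fit in destsz, the function reports an error instead of writing past the end of the buffer.

Note, however, that strncpy_s was added in the C11 standard as an optional extension (Annex K), so it is not available in some environments.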

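3.3 Extracting up to a specific character with `strchr`

Basic syntax of `strchr`

strchr finds the position of a specific character, which lets you take the string up to that position.

char *strchr(const char *str, int c);

str: the string to search
c: the character to search for (passed as an int and converted to char)

Usage example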

#include <stdio.h>
#include <string.h>

int main() {
    char str[] = "Hello, World!";
    char *pos = strchr(str, ','); // Find the position of ','

    if (pos != NULL) {
        int length = pos - str; // Calculate the number of characters up to ','
        char result[20];

        strncpy(result, str, length);
        result[length] = '\0'; // Add the null terminator

        printf("Substring: %s\n", result);  // Output "Hello"
    }

    return 0;
}

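Key points

strchr returns the address of the first occurrence of c in str.
pos - str gives the number of characters before that position, which is then copied with strncpy.

3.4 Searching for a keyword and extracting with `strstr`

Basic syntax of `strstr`

strstr searches for a substring and conveniently gives you the string from that point onward.

char *strstr(const char *haystack, const char *needle);

haystack: the string to search
needle: the substring to search for

Usage example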

#include <stdio.h>
#include <string.h>

int main() {
    char str[] = "Hello, World!";
    char *pos = strstr(str, "World"); // Search for the position of "World"

    if (pos != NULL) {
        printf("Found substring: %s\n", pos);
    } else {
        printf("Substring not found.\n");
    }

    return 0;
}

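Key points

strstr returns the address of the first occurrence of needle.
It returns NULL when needle is not found in haystack.

3.5 Section summary

strncpy copies a substring, but you must add the null character manually.
strncpy_s takes destsz, which improves safety.
strchr lets you take the substring up to a given character.
strstr finds the position of a keyword so that you can take everything from there onward.

Making good use of the standard library lets you implement string handling in C concisely and safely.

4. Extracting Substrings in C [Custom Functions]

The standard library covers basic substring extraction, but some situations call for a more flexible approach. This section explains substring extraction with custom functions.

4.1 Benefits of writing your own functions

The standard library can copy and search substrings, but it has the following problems.

strncpy does not always append the null character '\0'.
strchr and strstr only search; they do not extract anything by themselves.
More flexible string operations are hard to express.

A custom function tailored to a specific use is therefore effective.

4.2 A basic substring extraction function

First, let us create a basic function that extracts a string starting from a given position.

Function specification

Arguments
const char *source (the source string)
int start (the start position)
int length (the number of characters to extract)
char *dest (the buffer that receives the extracted string)
Behavior
Copies length characters starting at start into dest
Appends '\0' at the end

Implementation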

#include <stdio.h>
#include <string.h>

void substring(const char *source, int start, int length, char *dest) {
    int i;
    for (i = 0; i < length && source[start + i] != '\0'; i++) {
        dest[i] = source[start + i];
    }
    dest[i] = '\0'; // Add null terminator
}

int main() {
    char text[] = "Hello, World!";
    char result[10];

    substring(text, 7, 5, result); // Extract "World"
    printf("Substring: %s\n", result);

    return 0;
}

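Key points

The for loop copies at most length characters.
The loop also stops when it reaches '\0' in the source.
dest[i] = '\0'; always places the null character at the end.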

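4.3 Getting a substring dynamically with `malloc`

In the function above, the caller has to reserve dest in advance. The function becomes more general if it allocates the required size dynamically.

Function specification

Allocates the required memory with malloc
Copies length characters starting at start into the new buffer
The caller must release the memory with free

Implementation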

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *substring_dynamic(const char *source, int start, int length) {
    char *dest = (char *)malloc(length + 1); // +1 for the null terminator
    if (dest == NULL) {
        return NULL; // Memory allocation failed
    }

    int i;
    for (i = 0; i < length && source[start + i] != '\0'; i++) {
        dest[i] = source[start + i];
    }
    dest[i] = '\0';

    return dest;
}

int main() {
    char text[] = "Hello, World!";
    char *result = substring_dynamic(text, 7, 5);

    if (result != NULL) {
        printf("Substring: %s\n", result);
        free(result); // Free allocated memory
    } else {
        printf("Memory allocation failed.\n");
    }

    return 0;
}

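Key points

malloc allocates the memory dynamically.
After use, the caller must release the memory with free.

4.4 Multibyte character (Japanese) support

When handling Japanese (multibyte characters such as UTF-8), one character is not necessarily one byte, so the simple substring function does not work correctly.

An implementation that accounts for multibyte characters

Use mbstowcs to convert the multibyte string into a wide-character string (wchar_t)
Take the substring with wcsncpy
Convert it back to a multibyte string with wcstombs

Implementation (UTF-8 support)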

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <locale.h>

void substring_utf8(const char *source, int start, int length, char *dest) {
    setlocale(LC_ALL, ""); // Set the locale

    wchar_t wsource[256];
    mbstowcs(wsource, source, 256); // Convert UTF-8 string to wide-character string

    wchar_t wresult[256];
    wcsncpy(wresult, wsource + start, length); // Extract substring in wide characters
    wresult[length] = L'\0';

    wcstombs(dest, wresult, 256); // Convert back to multibyte string
}

int main() {
    char text[] = "こんにちは、世界！"; // UTF-8 string
    char result[20];

    substring_utf8(text, 6, 2, result); // Extract "世界" (index 5 is the comma "、")
    printf("Substring: %s\n", result);

    return 0;
}

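Key points

setlocale(LC_ALL, ""); sets the locale from the environment so that the conversion functions know which encoding to use.
mbstowcs converts the multibyte string into a wide-character string.
wcsncpy takes the substring, and wcstombs converts it back to a multibyte string.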

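4.5 Section summary

Writing your own substring function lets you take substrings flexibly.
Dynamic memory allocation (malloc) lets you return substrings of variable size.
Use mbstowcs / wcstombs when handling multibyte characters (Japanese).

When the standard library's strncpy or strchr is not enough, custom functions make string handling in C considerably more capable.

5. Extracting Substrings by Character Encoding

In C, substring extraction may not work correctly unless you pay attention to differences in character encoding. In particular, with the multibyte encodings used for Japanese (UTF-8, Shift_JIS, EUC-JP, and so on), one character is not always one byte, so plain strncpy or the substring function cannot handle them properly.

This section explains in detail how to extract substrings according to the character encoding.

5.1 ASCII (single-byte characters)

Basic substring extraction

In ASCII, one character is one byte, so strncpy or the substring function handles it easily.

Implementation example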

#include <stdio.h>
#include <string.h>

void substring_ascii(const char *source, int start, int length, char *dest) {
    strncpy(dest, source + start, length);
    dest[length] = '\0'; // Add null terminator
}

int main() {
    char text[] = "Hello, World!";
    char result[6];

    substring_ascii(text, 7, 5, result); // Extract "World"
    printf("Substring: %s\n", result);

    return 0;
}

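Key points

For ASCII text (alphanumeric characters only), strncpy is sufficient.
Always add '\0' (the null character).

5.2 UTF-8 (multibyte characters)

Characteristics of UTF-8

In UTF-8, one character takes 1 to 4 bytes. Because the length is variable, using strncpy directly may cut a character in the middle.

The correct way to handle it

To handle UTF-8 safely in C, the recommended approach is to convert to a wide-character string (wchar_t) with mbstowcs and then take the substring.

Substring extraction with UTF-8 support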

#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <locale.h>

void substring_utf8(const char *source, int start, int length, char *dest) {
    setlocale(LC_ALL, ""); // Set the locale

    wchar_t wsource[256];
    mbstowcs(wsource, source, 256); // Convert multibyte string to wide-character string

    wchar_t wresult[256];
    wcsncpy(wresult, wsource + start, length); // Get the substring
    wresult[length] = L'\0';

    wcstombs(dest, wresult, 256); // Convert wide-character string back to multibyte
}

int main() {
    char text[] = "こんにちは、世界！"; // UTF-8 string
    char result[20];

    substring_utf8(text, 6, 2, result); // Extract "世界" (index 5 is the comma "、")
    printf("Substring: %s\n", result);

    return 0;
}

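Key points

setlocale(LC_ALL, ""); sets the locale.
mbstowcs converts to wchar_t, and wcsncpy takes the substring.
wcstombs converts the result back to a multibyte string.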

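5.3 Shift_JIS (multibyte characters)

Characteristics of Shift_JIS

In Shift_JIS, one character is either 1 byte or 2 bytes, so using strncpy directly results in garbled text.

Substring extraction with Shift_JIS support

For Shift_JIS as well, converting to a wide-character string before processing is the recommended approach.

Shift_JIS implementation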

#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <locale.h>

void substring_sjis(const char *source, int start, int length, char *dest) {
    setlocale(LC_ALL, "Japanese"); // Set locale to handle Shift_JIS (locale names are platform-specific)

    wchar_t wsource[256];
    mbstowcs(wsource, source, 256); // Convert multibyte string (Shift_JIS) to wide-character string

    wchar_t wresult[256];
    wcsncpy(wresult, wsource + start, length); // Extract substring
    wresult[length] = L'\0';

    wcstombs(dest, wresult, 256); // Convert wide-character string back to multibyte (Shift_JIS)
}

int main() {
    char text[] = "こんにちは、世界！"; // Shift_JIS string (depending on environment)
    char result[20];

    substring_sjis(text, 6, 2, result); // Extract "世界" (index 5 is the comma "、")
    printf("Substring: %s\n", result);

    return 0;
}

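Key points

A locale must be set in order to handle Shift_JIS correctly.
mbstowcs and wcstombs perform the conversion.

5.4 EUC-JP (multibyte characters)

Characteristics of EUC-JP

Like Shift_JIS, EUC-JP uses a different number of bytes per character, so it needs conversion through wide characters.

Substring extraction with EUC-JP support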

#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <locale.h>

void substring_eucjp(const char *source, int start, int length, char *dest) {
    setlocale(LC_ALL, "ja_JP.eucJP"); // Set locale to handle EUC-JP

    wchar_t wsource[256];
    mbstowcs(wsource, source, 256); // Convert multibyte string (EUC-JP) to wide-character string

    wchar_t wresult[256];
    wcsncpy(wresult, wsource + start, length); // Extract substring
    wresult[length] = L'\0';

    wcstombs(dest, wresult, 256); // Convert wide-character string back to multibyte (EUC-JP)
}

int main() {
    char text[] = "こんにちは、世界！"; // EUC-JP string (depending on environment)
    char result[20];

    substring_eucjp(text, 6, 2, result); // Extract "世界" (index 5 is the comma "、")
    printf("Substring: %s\n", result);

    return 0;
}

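Key points

setlocale(LC_ALL, "ja_JP.eucJP"); sets the locale for EUC-JP.
mbstowcs and wcstombs perform the conversion.

5.5 Summary

Encoding	Bytes per character	Recommended approach
ASCII	1 byte	`strncpy`
UTF-8	1-4 bytes	`mbstowcs` / `wcstombs`
Shift_JIS	1 or 2 bytes	`mbstowcs` / `wcstombs`
EUC-JP	1 to 3 bytes	`mbstowcs` / `wcstombs`

strncpy is fine for ASCII-only text.
For UTF-8, Shift_JIS, and EUC-JP, use mbstowcs / wcstombs.
Set setlocale(LC_ALL, "..."); appropriately for your environment.

6. Splitting Strings in C

Splitting strings is needed in many situations, such as parsing CSV data, processing command-line arguments, and analyzing log data. In C you can use library functions such as strtok (standard C) and strtok_r (POSIX), or write your own function.

This section explains in detail how to split a string on a specific delimiter.

6.1 Splitting a string with `strtok`

Basic syntax

strtok splits a string on the specified delimiter characters.

char *strtok(char *str, const char *delim);

str: the string to split (pass NULL on the second and later calls)
delim: the delimiter characters
Return value: a pointer to the next token, or NULL when no tokens remain
Note: strtok overwrites each delimiter it finds with '\0'

Example: splitting a string on a comma `,`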

#include <stdio.h>
#include <string.h>


int main() {
    char str[] = "apple,banana,orange,grape"; // String to be split
    char *token = strtok(str, ",");            // Get the first token

    while (token != NULL) {
        printf("Token: %s\n", token);
        token = strtok(NULL, ",");             // Get the next token
    }

    return 0;
}

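Output

Token: apple
Token: banana
Token: orange
Token: grape

Notes on `strtok`

It modifies the original string

strtok overwrites the delimiters with '\0', so the original string is changed.

It is not thread-safe

strtok keeps its position in internal static state, so it cannot safely be used from several threads or to tokenize two strings at the same time.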

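6.2 Thread-safe string splitting with `strtok_r`

Basic syntax

strtok_r is the thread-safe version of strtok. Because it stores its state in saveptr, it can be used safely in multithreaded environments.

char *strtok_r(char *str, const char *delim, char **saveptr);

str: the string to split (pass NULL on the second and later calls)
delim: the delimiter characters
saveptr: a pointer in which the function stores its internal state

Example: splitting a string on spaces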

#include <stdio.h>
#include <string.h>

int main() {
    char str[] = "Hello World from C"; // String to be split
    char *token;
    char *saveptr; // Pointer to store internal state

    token = strtok_r(str, " ", &saveptr); // Get the first token
    while (token != NULL) {
        printf("Token: %s\n", token);
        token = strtok_r(NULL, " ", &saveptr); // Get the next token
    }

    return 0;
}

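Advantages of `strtok_r`

It is thread-safe.
It can process several strings at the same time.

6.3 Splitting a string with a custom function (without `strtok`)

strtok modifies the original string, so you can also write your own function that splits a string without changing it.

Specification of the custom function

Input
const char *source (the source string)
const char delim (the delimiter character)
char tokens[][50] (the array that receives the tokens)
Processing
Reads source character by character and leaves it unchanged
Stores each piece separated by delim in tokens

Implementation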

#include <stdio.h>
#include <string.h>

void split_string(const char *source, char delim, char tokens[][50], int *count) {
    int i = 0, j = 0, token_index = 0;

    while (source[i] != '\0') {
        if (source[i] == delim) {
            tokens[token_index][j] = '\0';
            token_index++;
            j = 0;
        } else {
            tokens[token_index][j] = source[i];
            j++;
        }
        i++;
    }
    tokens[token_index][j] = '\0';
    *count = token_index + 1;
}

int main() {
    char text[] = "dog,cat,bird,fish";
    char tokens[10][50]; // Can store up to 10 words
    int count;

    split_string(text, ',', tokens, &count);

    for (int i = 0; i < count; i++) {
        printf("Token: %s\n", tokens[i]);
    }

    return 0;
}

Output

Token: dog
Token: cat
Token: bird
Token: fish

Key points

source is not modified.
The results are stored in tokens.

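6.4 Applied string splitting (processing CSV data)

Example of parsing CSV data

The strtok family can parse CSV (comma-separated) data. When lines and fields are tokenized in nested loops, each level needs its own tokenizer state, because plain strtok has only one hidden state shared by all calls.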

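#include <stdio.h>
#include <string.h>

int main() {
    char csv[] = "Alice,24,Female\nBob,30,Male\nCharlie,28,Male"; // CSV data
    char *line_state;  // Tokenizer state for lines
    char *field_state; // Separate tokenizer state for fields
    char *line = strtok_r(csv, "\n", &line_state); // Process line by line

    while (line != NULL) {
        // Nested strtok calls would overwrite each other's state,
        // so each level uses strtok_r with its own state pointer
        char *name = strtok_r(line, ",", &field_state);
        char *age = strtok_r(NULL, ",", &field_state);
        char *gender = strtok_r(NULL, ",", &field_state);

        if (name != NULL && age != NULL && gender != NULL) {
            printf("Name: %s, Age: %s, Gender: %s\n", name, age, gender);
        }

        line = strtok_r(NULL, "\n", &line_state);
    }

    return 0;
}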

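Output

Name: Alice, Age: 24, Gender: Female
Name: Bob, Age: 30, Gender: Male
Name: Charlie, Age: 28, Gender: Male

6.5 Summary

Method	Advantage	Drawback
`strtok`	Splits easily	Modifies the original string
`strtok_r`	Thread-safe	Slightly more complicated to use
Custom function	Does not modify the original string	Longer code
CSV parsing	Convenient for data processing	Subject to the limitations of `strtok`

Conclusion

For simple splitting, use strtok.
For multithreaded code, use strtok_r.
If you do not want to change the original, use a custom function.
The same techniques also apply to parsing CSV data.

The next section covers applied examples: extracting the text before and after a specific character.

7. Applied Examples: Extracting the Text Before and After a Specific Character

String processing often requires extracting the text before or after a specific character or keyword. Consider cases such as the following.

Getting only the domain part from a URL
Extracting the file name from a file path
Getting the text before or after a specific tag or symbol

In C, strchr and strstr make this kind of processing possible. When more flexibility is needed, writing your own function is also effective.

7.1 Getting the string before a specific character with `strchr`

Basic syntax

strchr locates the first occurrence of a specific character.

char *strchr(const char *str, int c);

str: the string to search
c: the character to search for (passed as an int and converted to char)

If strchr finds c, it returns its address. The example below uses the related function strrchr, which finds the last occurrence instead.

Example: getting the file name from a file path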

#include <stdio.h>
#include <string.h>

void get_filename(const char *path, char *filename) {
    char *pos = strrchr(path, '/'); // Search for the last '/'

    if (pos != NULL) {
        strcpy(filename, pos + 1); // Copy from the character after '/'
    } else {
        strcpy(filename, path); // If no '/', copy the whole path
    }
}

int main() {
    char path[] = "/home/user/documents/report.txt";
    char filename[50];

    get_filename(path, filename);
    printf("Filename: %s\n", filename);

    return 0;
}

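Output

Filename: report.txt

Key points

strrchr returns the position of the last occurrence of a specific character (/).
pos + 1 points just past that character, so only the file name is copied.

7.2 Getting the string after a specific keyword with `strstr`

Basic syntax

strstr searches for a specific string (keyword) and lets you take the string from that position onward.

char *strstr(const char *haystack, const char *needle);

haystack: the string to search
needle: the string to search for

If strstr finds needle, it returns the address of that position.

Example: getting the domain from a URL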

#include <stdio.h>
#include <string.h>

void get_domain(const char *url, char *domain) {
    char *pos = strstr(url, "://"); // Search for the position of "://"

    if (pos != NULL) {
        strcpy(domain, pos + 3); // Copy from the character after "://"
    } else {
        strcpy(domain, url); // If "://" is not found, copy the entire string
    }
}

int main() {
    char url[] = "https://www.example.com/page.html";
    char domain[50];

    get_domain(url, domain);
    printf("Domain part: %s\n", domain);

    return 0;
}

Output

Domain part: www.example.com/page.html

Key points

strstr finds "://", so the same code works for "https://", "http://", and other schemes.
pos + 3 skips past "://" and points at the rest of the URL.

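7.3 Splitting the parts before and after a specific character with `strchr`

With strchr you can split a string into the parts before and after a specific character and retrieve both.

Example: separating the user name and domain of an email address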

#include <stdio.h>
#include <string.h>

void split_email(const char *email, char *username, char *domain) {
    username[0] = '\0'; // Leave both outputs empty if '@' is not found
    domain[0] = '\0';
    char *pos = strchr(email, '@'); // Search for the position of '@'

    if (pos != NULL) {
        strncpy(username, email, pos - email); // Copy the part before '@'
        username[pos - email] = '\0';          // Add null terminator
        strcpy(domain, pos + 1);               // Copy the part after '@'
    }
}

int main() {
    char email[] = "user@example.com";
    char username[50], domain[50];

    split_email(email, username, domain);
    printf("Username: %s\n", username);
    printf("Domain: %s\n", domain);

    return 0;
}

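Output

Username: user
Domain: example.com

Key points

strchr finds the position of '@'.
strncpy copies the part before '@', and a null character is appended.
strcpy copies the part after '@'.

7.4 Application: extracting a specific attribute from an HTML tag

strstr can also be used to get a specific attribute from an HTML tag.

Example: getting the URL from `<a href="URL">`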

#include <stdio.h>
#include <string.h>

void get_href(const char *html, char *url) {
    url[0] = '\0'; // Default to an empty string if no href is found
    char *start = strstr(html, "href=\""); // Search for the position of href="
    if (start != NULL) {
        start += 6; // Move past href="
        char *end = strchr(start, '"'); // Search for the next "
        if (end != NULL) {
            strncpy(url, start, end - start);
            url[end - start] = '\0'; // Add null terminator
        }
    }
}

int main() {
    char html[] = "<a href=\"https://example.com\">Click Here</a>";
    char url[100];

    get_href(html, url);
    printf("Extracted URL: %s\n", url);

    return 0;
}

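Output

Extracted URL: https://example.com

Key points

strstr finds the position of href=".
strchr finds the closing " that ends the attribute value.

7.5 Section summary

Task	Functions used	Advantage
Get the text before a specific character	`strchr` / `strrchr`	Simple and fast
Get the text after a specific keyword	`strstr`	Supports keyword search
Split the text before and after a specific character	`strchr` + `strncpy`	Handy for splitting user names and domains
Get an HTML tag attribute	`strstr` + `strchr`	Applicable to web scraping

Conclusion

strchr and strstr make it easy to get the text before or after a specific character or keyword.
They are useful in many situations, such as file path handling, URL parsing, and splitting email addresses.
They can also be applied to more advanced processing such as web scraping.

8. Summary

This article has explained how to extract substrings in C in detail, from the basics to applications. Here we review the key points of each section and organize the best method for each use case.

8.1 Review of the article

Section	Content	Key point
Basics of C strings	In C, strings are handled as arrays, and the terminating character matters	Do not forget '\0' when handling strings
Extraction with the standard library	`strncpy`, `strchr`	`strncpy` needs the null terminator added manually
Extraction with custom functions	Writing flexible functions	`malloc` makes variable-length substrings possible
Handling by character encoding	How to handle UTF-8, Shift_JIS, and EUC-JP	Converting to wide characters with `mbstowcs` / `wcstombs` is safe
Splitting strings	`strtok`, `strtok_r`	`strtok` modifies the original string
Extracting before and after a specific character	`strchr`, `strstr`	Getting file names, URL parsing, HTML parsing

8.2 Best method by use case

1. Extracting a substring

Situation	Best method
Getting a string of a fixed length	`strncpy` or `substring()`
Extracting safely	`strncpy_s` (C11 and later)
Handling multibyte characters (UTF-8, Shift_JIS, EUC-JP)	`mbstowcs` / `wcstombs`

2. Splitting a string

Situation	Best method
Splitting a string simply	`strtok`
Splitting in a thread-safe way	`strtok_r`
Splitting without changing the original string	Custom function (`split_string()`)

3. Getting the text before and after a specific character

Situation	Best method
Getting the file name from a file path	`strrchr(path, '/')`
Getting the domain part from a URL	`strstr(url, "://")`
Separating the user name and domain of an email address	`strchr(email, '@')`
Getting an attribute value from an HTML tag	`strstr(tag, "href=\"")` + `strchr(tag, '"')`

8.3 Points to watch in C string handling

1. Manage the terminating null character '\0' rigorously

In C string handling, proper management of the terminating character '\0' matters most. In particular, when extracting with strncpy or strchr, remember that you must add the null character manually.

Safe string copy example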

#include <stdio.h>
#include <string.h>

int main() {
    char src[] = "Hello, World!";
    char dest[6];

    strncpy(dest, src, 5);
    dest[5] = '\0'; // Add null terminator for safety

    printf("Substring: %s\n", dest);

    return 0;
}

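2. Watch out for buffer overflows

In C string operations you must implement carefully so that you never access memory outside the bounds of an array. When using strncpy in particular, it is important to control the number of bytes copied.

Safe string copy example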

#include <stdio.h>
#include <string.h>

int main() {
    char src[] = "Hello, World!";
    char dest[6];

    strncpy(dest, src, sizeof(dest) - 1);
    dest[sizeof(dest) - 1] = '\0'; // Explicitly add null terminator at the last element

    printf("Substring: %s\n", dest);
    return 0;
}

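3. Use `mbstowcs` for multibyte characters

When handling multibyte characters such as UTF-8 or Shift_JIS, plain strncpy or strlen may not work correctly.

For multibyte characters, it is therefore recommended to convert to a wide-character string with mbstowcs first and then process it.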

#include <stdio.h>
#include <stdlib.h> // Declares mbstowcs
#include <wchar.h>
#include <locale.h>

int main() {
    setlocale(LC_ALL, ""); // Set the locale

    char text[] = "こんにちは、世界！"; // UTF-8
    wchar_t wtext[256];

    mbstowcs(wtext, text, 256); // Convert to wide-character string

    printf("Converted wide-character string: %ls\n", wtext);

    return 0;
}

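4. Manage buffer sizes

In string processing it is important to calculate the required memory size in advance and prevent buffer overflows. In particular, when obtaining dynamic memory with malloc, keep track of the exact size.

8.4 For further learning

String handling in C is an important skill for improving program safety and readability. Building on the contents of this article, studying the following topics will enable more advanced string processing.

Topics for deeper study

Regular expressions (regex)
File operations (string processing with fgets and fscanf)
Memory management (dynamic string processing with malloc and realloc)
Data parsing (how to parse JSON and XML)

8.5 Summary

C strings are managed as char arrays, so handling the terminating character '\0' is important.
To extract a substring, use strncpy, substring(), or malloc.
To split a string, use strtok, strtok_r, or a custom function.
To get the text before or after a specific character, use strchr or strstr.
To handle multibyte characters (Japanese), use mbstowcs.
Keep string handling safe and avoid buffer overflows.

Put the contents of this article to use to implement practical string processing in C. Once you understand the basic functions, take on custom functions and applied processing, and write more efficient code!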
