← 目录 / 第十一章 · STL 标准模板库 / 11.13.5 集合运算

11.13.5　集合运算

set_union、set_intersection、set_difference —— 对两个已排序序列做并集、交集、差集运算。

先导课 1 +

STL 是什么

序列容器 4 +

string vector list deque

容器适配器 3 +

stack queue priority_queue

关联容器 3 +

map unordered_map set

实用类型 2 +

pair bitset

STL 算法 6

排序与排列查找与计数复制替换移动数值运算集合运算条件判断遍历

📂 引入头文件：#include <algorithm> —— 本节算法全部在这里，但都要求两个输入序列已经排好序。

本页目录

①set_union 并集
②set_intersection 交集
③set_difference 差集

这三个函数都对两个已排序的序列做集合运算，写法几乎一样：传入两个序列各自的 [begin, end)，再传入存放结果的起始位置。返回值是结果的尾后迭代器，配合 erase 才能让 result 容器的大小刚好等于真实结果长度。

a = {1, 2, 3, 4, 5}　b = {3, 4, 5, 6, 7}

并集 a ∪ b

交集 a ∩ b

差集 a − b

蓝色格子是只在 a 中的元素，粉色格子是只在 b 中的元素，绿色格子是两边都有的元素——三个集合运算分别取了不同部分的组合。

① set_union —— 并集

合并两个序列中出现过的所有元素（重复元素只保留一份），结果同样保持有序。

C++ · set_union 求并集

        
          1vector<int> a = {1, 2, 3, 4, 5};
2vector<int> b = {3, 4, 5, 6, 7};
3vector<int> result;
4
5// 并集：a ∪ b
6result.resize(a.size() + b.size());
7auto it = set_union(a.begin(), a.end(), b.begin(), b.end(), result.begin());
8result.erase(it, result.end());
9// result = {1, 2, 3, 4, 5, 6, 7}

      

📌

为什么要先 resize？这三个集合运算函数都是把结果写入到 result.begin() 开始的位置，而不是自动 push_back——如果 result 本身空间不够，会越界。并集结果最多有 a.size() + b.size() 个元素，所以按这个上限先分配空间最安全，多余的部分用 erase 删掉。

② set_intersection —— 交集

只保留两个序列都出现过的元素。

C++ · set_intersection 求交集

        
          1// 交集：a ∩ b
2result.resize(min(a.size(), b.size()));
3it = set_intersection(a.begin(), a.end(), b.begin(), b.end(), result.begin());
4result.erase(it, result.end());
5// result = {3, 4, 5}

      

💡

交集的结果不会超过较短那个序列的长度（最多和短的那个一样长），所以用 min(a.size(), b.size()) 来预分配空间最合适。

③ set_difference —— 差集

保留在 a 中但不在 b 中的元素——注意差集是有方向的，set_difference(a, b) 和 set_difference(b, a) 结果不一样。

C++ · set_difference 求差集

        
          1// 差集：a - b（在 a 中但不在 b 中）
2result.resize(a.size());
3it = set_difference(a.begin(), a.end(), b.begin(), b.end(), result.begin());
4result.erase(it, result.end());
5// result = {1, 2}

      

⚠️

别忘了排序：这三个函数都要求输入序列已经有序，否则结果是未定义的。如果数据本身没排序，要先各自 sort 一遍。如果你存的本身就是 set 或者 map（天然有序），可以直接拿 begin() / end() 用，不需要再排序。

🎯

这三个函数的时间复杂度都是 O(n + m)（n、m 分别是两个序列的长度）——比"对每个元素用 find 暴力查找另一个序列"的 O(n × m) 快得多，原理类似归并排序里的双指针合并。

1	vector<int> a = {1, 2, 3, 4, 5};
2	vector<int> b = {3, 4, 5, 6, 7};
3	vector<int> result;
4
5	// 并集：a ∪ b
6	result.resize(a.size() + b.size());
7	auto it = set_union(a.begin(), a.end(), b.begin(), b.end(), result.begin());
8	result.erase(it, result.end());
9	// result = {1, 2, 3, 4, 5, 6, 7}

1	// 交集：a ∩ b
2	result.resize(min(a.size(), b.size()));
3	it = set_intersection(a.begin(), a.end(), b.begin(), b.end(), result.begin());
4	result.erase(it, result.end());
5	// result = {3, 4, 5}

1	// 差集：a - b（在 a 中但不在 b 中）
2	result.resize(a.size());
3	it = set_difference(a.begin(), a.end(), b.begin(), b.end(), result.begin());
4	result.erase(it, result.end());
5	// result = {1, 2}