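To build every combination of key paths in JSON/dict data, you can follow the steps below.
Step 1: Use a recursive function to enumerate all JSON/dict paths
We need a recursive function that extracts every path and stores each path together with its value in a list. The following example code enumerates all paths in a JSON object or dict: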
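def get_all_paths(data, path, paths):
    # The caller supplies `paths`; a default of [] would be shared across calls.
    if isinstance(data, dict):
        for k, v in data.items():
            # Join dict keys with "." (no leading dot at the top level).
            get_all_paths(v, f"{path}.{k}" if path else k, paths)
    elif isinstance(data, list):
        for i, v in enumerate(data):
            # List items are addressed as "[index]" after the parent path.
            get_all_paths(v, f"{path}[{i}]", paths)
    else:
        # Leaf value: record its full path together with the value.
        paths.append((path, data))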
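The function takes three parameters: data is the JSON/dict object to enumerate, path is the current path, and paths is the list that collects the results.
The key idea is that, while recursing into a dict or list, the function reads the current key or index, merges it with the parent path to form a new path, and then recurses into the child element. When it reaches a plain value (anything that is not a dict or list), it appends the path and the value to the paths list.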
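As a complement to the description above, here is a minimal sketch of the same traversal written as a generator, so the caller does not have to pass in an accumulator list. The name iter_paths and the sample data are assumptions made for illustration and are not part of the original code.

```python
def iter_paths(data, path=""):
    """Yield (path, value) pairs for every leaf in nested dicts and lists."""
    if isinstance(data, dict):
        for key, value in data.items():
            # Join dict keys with "." unless we are at the top level.
            yield from iter_paths(value, f"{path}.{key}" if path else str(key))
    elif isinstance(data, list):
        for index, value in enumerate(data):
            # List positions are written as "[index]" after the parent path.
            yield from iter_paths(value, f"{path}[{index}]")
    else:
        # Leaf value: emit its full path together with the value.
        yield path, data


# Hypothetical sample data, used only to show the shape of the result.
sample = {"user": {"name": "Alice", "tags": ["a", "b"]}, "active": True}
leaf_paths = list(iter_paths(sample))
print(leaf_paths)
# [('user.name', 'Alice'), ('user.tags[0]', 'a'), ('user.tags[1]', 'b'), ('active', True)]
```

Like the accumulator version, this sketch emits nothing for an empty dict or an empty list, because there is no leaf value to report.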
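Step 2: Join the paths into all possible combinations
With the list of all paths in hand, we now use those paths to generate a list of every possible combination. Here is one way to do it: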
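import itertools

def get_all_combinations(paths):
    combos = []
    # Take every combination size, from a single path up to all paths.
    for i in range(1, len(paths) + 1):
        for comb in itertools.combinations(paths, i):
            keys = [k for k, v in comb]
            values = [v for k, v in comb]
            # One dict per combination: "_"-joined paths -> list of their values.
            combos.append({"_".join(keys): values})
    return combos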
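This function takes the list of paths, generates every non-empty combination of them, and collects the results in a list of dicts. Each dict has one composite key built from the paths in the combination, mapped to the list of the corresponding values. The output has the following shape: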
[
    {"key1": [value1]},
    {"key2": [value2]},
    {"key3": [value3]},
    {"key1_key2": [value1, value2]},
    {"key1_key3": [value1, value3]},
    {"key2_key3": [value2, value3]},
    {"key1_key2_key3": [value1, value2, value3]}
]
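Each dict represents one combination. Its key is formed by joining the individual path keys with underscores, and its value is the list of values for the paths that make up that key, in the same order.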
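One caveat of joining with underscores is that the composite key can become ambiguous when a path itself contains an underscore. The sketch below illustrates this with made-up paths and shows one possible alternative, keeping the key as a tuple of paths. The function name get_tuple_keyed_combinations is hypothetical and not part of the original code.

```python
import itertools


def get_tuple_keyed_combinations(paths):
    """Like the "_"-joined version, but each key stays a tuple of path strings."""
    combos = []
    for size in range(1, len(paths) + 1):
        for comb in itertools.combinations(paths, size):
            keys = tuple(path for path, _ in comb)
            values = [value for _, value in comb]
            combos.append({keys: values})
    return combos


# Hypothetical leaf paths where one key already contains an underscore.
paths = [("a", 1), ("b", 2), ("a_b", 3)]

joined = ["_".join(path for path, _ in comb)
          for size in range(1, len(paths) + 1)
          for comb in itertools.combinations(paths, size)]
print(joined.count("a_b"))  # 2: the single path "a_b" and the pair ("a", "b")

tuple_keys = [next(iter(combo)) for combo in get_tuple_keyed_combinations(paths)]
print(len(tuple_keys) == len(set(tuple_keys)))  # True: every key is distinct
```

If the data is known to have no underscores in its keys, the joined string form is unambiguous and the tuple form is unnecessary.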
Example 1: All combinations of all paths for a simple JSON object
Suppose we have the following JSON object:
{
    "name": "Alice",
    "age": 30,
    "hobbies": ["reading", "dancing"],
    "address": {
        "city": "Beijing",
        "postcode": "100000"
    }
}
Now we can use the get_all_paths function to extract all paths:
>>> data = {
...     "name": "Alice",
...     "age": 30,
...     "hobbies": ["reading", "dancing"],
...     "address": {
...         "city": "Beijing",
...         "postcode": "100000"
...     }
... }
>>> paths = []
>>> get_all_paths(data, "", paths)
>>> paths
[
('name', 'Alice'),
('age', 30),
('hobbies[0]', 'reading'),
('hobbies[1]', 'dancing'),
('address.city', 'Beijing'),
('address.postcode', '100000')
]
We can now use the get_all_combinations function to expand the list:
>>> combos = get_all_combinations(paths)
>>> combos
[
{'name': ['Alice']},
{'age': [30]},
{'hobbies[0]': ['reading']},
{'hobbies[1]': ['dancing']},
{'address.city': ['Beijing']},
{'address.postcode': ['100000']},
{'name_age': ['Alice', 30]},
{'name_hobbies[0]': ['Alice', 'reading']},
{'name_hobbies[1]': ['Alice', 'dancing']},
{'name_address.city': ['Alice', 'Beijing']},
{'name_address.postcode': ['Alice', '100000']},
{'age_hobbies[0]': [30, 'reading']},
{'age_hobbies[1]': [30, 'dancing']},
{'age_address.city': [30, 'Beijing']},
{'age_address.postcode': [30, '100000']},
{'hobbies[0]_hobbies[1]': ['reading', 'dancing']},
{'hobbies[0]_address.city': ['reading', 'Beijing']},
{'hobbies[0]_address.postcode': ['reading', '100000']},
{'hobbies[1]_address.city': ['dancing', 'Beijing']},
{'hobbies[1]_address.postcode': ['dancing', '100000']},
{'address.city_address.postcode': ['Beijing', '100000']},
{'name_age_hobbies[0]': ['Alice', 30, 'reading']},
{'name_age_hobbies[1]': ['Alice', 30, 'dancing']},
{'name_age_address.city': ['Alice', 30, 'Beijing']},
{'name_age_address.postcode': ['Alice', 30, '100000']},
{'name_hobbies[0]_hobbies[1]': ['Alice', 'reading', 'dancing']},
{'name_hobbies[0]_address.city': ['Alice', 'reading', 'Beijing']},
{'name_hobbies[0]_address.postcode': ['Alice', 'reading', '100000']},
{'name_hobbies[1]_address.city': ['Alice', 'dancing', 'Beijing']},
{'name_hobbies[1]_address.postcode': ['Alice', 'dancing', '100000']},
{'name_address.city_address.postcode': ['Alice', 'Beijing', '100000']},
{'age_hobbies[0]_hobbies[1]': [30, 'reading', 'dancing']},
{'age_hobbies[0]_address.city': [30, 'reading', 'Beijing']},
{'age_hobbies[0]_address.postcode': [30, 'reading', '100000']},
{'age_hobbies[1]_address.city': [30, 'dancing', 'Beijing']},
{'age_hobbies[1]_address.postcode': [30, 'dancing', '100000']},
{'age_address.city_address.postcode': [30, 'Beijing', '100000']},
{'hobbies[0]_hobbies[1]_address.city': ['reading', 'dancing', 'Beijing']},
{'hobbies[0]_hobbies[1]_address.postcode': ['reading', 'dancing', '100000']},
{'hobbies[0]_address.city_address.postcode': ['reading', 'Beijing', '100000']},
{'hobbies[1]_address.city_address.postcode': ['dancing', 'Beijing', '100000']},
{'name_age_hobbies[0]_hobbies[1]': ['Alice', 30, 'reading', 'dancing']},
{'name_age_hobbies[0]_address.city': ['Alice', 30, 'reading', 'Beijing']},
{'name_age_hobbies[0]_address.postcode': ['Alice', 30, 'reading', '100000']},
{'name_age_hobbies[1]_address.city': ['Alice', 30, 'dancing', 'Beijing']},
{'name_age_hobbies[1]_address.postcode': ['Alice', 30, 'dancing', '100000']},
{'name_age_address.city_address.postcode': ['Alice', 30, 'Beijing', '100000']},
{'name_hobbies[0]_hobbies[1]_address.city': ['Alice', 'reading', 'dancing', 'Beijing']},
{'name_hobbies[0]_hobbies[1]_address.postcode': ['Alice', 'reading', 'dancing', '100000']},
{'name_hobbies[0]_address.city_address.postcode': ['Alice', 'reading', 'Beijing', '100000']},
{'name_hobbies[1]_address.city_address.postcode': ['Alice', 'dancing', 'Beijing', '100000']},
{'age_hobbies[0]_hobbies[1]_address.city': [30, 'reading', 'dancing', 'Beijing']},
{'age_hobbies[0]_hobbies[1]_address.postcode': [30, 'reading', 'dancing', '100000']},
{'age_hobbies[0]_address.city_address.postcode': [30, 'reading', 'Beijing', '100000']},
{'age_hobbies[1]_address.city_address.postcode': [30, 'dancing', 'Beijing', '100000']},
{'hobbies[0]_hobbies[1]_address.city_address.postcode': ['reading', 'dancing', 'Beijing', '100000']},
{'name_age_hobbies[0]_hobbies[1]_address.city': ['Alice', 30, 'reading', 'dancing', 'Beijing']},
{'name_age_hobbies[0]_hobbies[1]_address.postcode': ['Alice', 30, 'reading', 'dancing', '100000']},
{'name_age_hobbies[0]_address.city_address.postcode': ['Alice', 30, 'reading', 'Beijing', '100000']},
{'name_age_hobbies[1]_address.city_address.postcode': ['Alice', 30, 'dancing', 'Beijing', '100000']},
{'name_hobbies[0]_hobbies[1]_address.city_address.postcode': ['Alice', 'reading', 'dancing', 'Beijing', '100000']},
{'age_hobbies[0]_hobbies[1]_address.city_address.postcode': [30, 'reading', 'dancing', 'Beijing', '100000']},
{'name_age_hobbies[0]_hobbies[1]_address.city_address.postcode': ['Alice', 30, 'reading', 'dancing', 'Beijing', '100000']}
]
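As you can see, there are 63 elements: one for each non-empty subset of the six paths, that is 2^6 - 1 = 63. The combinations range from the simplest (a single path) to the most complex (all six paths together).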
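The size of the result can be checked without building it. The sketch below counts the combinations for the six leaf paths of this example in two ways: with the binomial coefficients and by brute force. The helper name count_combinations is an assumption for illustration, and math.comb requires Python 3.8 or later.

```python
import itertools
import math


def count_combinations(n):
    """Number of non-empty subsets of n paths: C(n,1) + ... + C(n,n) = 2**n - 1."""
    return sum(math.comb(n, size) for size in range(1, n + 1))


# Combinations of each size for the six leaf paths in this example.
sizes = [math.comb(6, size) for size in range(1, 7)]
print(sizes)                  # [6, 15, 20, 15, 6, 1]
print(count_combinations(6))  # 63

# Brute-force check that enumerates the combinations instead of using the formula.
brute_force = sum(1 for size in range(1, 7)
                  for _ in itertools.combinations(range(6), size))
print(brute_force == 2**6 - 1)  # True
```

Because the count doubles with every extra path, the full list becomes impractical quickly as the input grows.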
Example 2: Enumerating all key combinations of a complex JSON object
Now let's try a more complex JSON object:
{
    "firstName": "John",
    "lastName": "Smith",
    "isAlive": true,
    "age": 27,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021-3100",
        "country": "USA"
    },
    "phoneNumbers": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "office",
            "number": "646 555-4567"
        },
        {
            "type": "mobile",
            "number": "123 456-7890"
        }
    ],
    "children": [],
    "spouse": null
}
We process this JSON object the same way as before and get the following output:
>>> data = {
...     "firstName": "John",
...     "lastName": "Smith",
...     "isAlive": True,
...     "age": 27,
...     "address": {
...         "streetAddress": "21 2nd Street",
...         "city": "New York",
...         "state": "NY",
...         "postalCode": "10021-3100",
...         "country": "USA"
...     },
...     "phoneNumbers": [
...         {
...             "type": "home",
...             "number": "212 555-1234"
...         },
...         {
...             "type": "office",
...             "number": "646 555-4567"
...         },
...         {
...             "type": "mobile",
...             "number": "123 456-7890"
...         }
...     ],
...     "children": [],
...     "spouse": None
... }
>>> paths = []
>>> get_all_paths(data, "", paths)
>>> combos = get_all_combinations(paths)
>>> len(combos)
65535
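In this example we get 65535 results, far more than the 63 from the first example. get_all_paths finds 16 leaf paths here (the empty children list contributes no path, while spouse, whose value is None, counts as a leaf), and 16 paths yield 2^16 - 1 = 65535 non-empty combinations. Every additional path roughly doubles the number of combinations.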
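With this many results, it can help to generate the combinations lazily and to cap the combination size. The following is a minimal sketch of that idea. The name iter_combinations and its max_size parameter are assumptions for illustration, and the 16 placeholder paths simply mirror the number of leaf values in the object above.

```python
import itertools


def iter_combinations(paths, max_size=None):
    """Lazily yield {"joined_key": [values]} dicts, smallest combinations first."""
    limit = len(paths) if max_size is None else min(max_size, len(paths))
    for size in range(1, limit + 1):
        for comb in itertools.combinations(paths, size):
            keys = [path for path, _ in comb]
            values = [value for _, value in comb]
            yield {"_".join(keys): values}


# 16 placeholder (path, value) pairs standing in for real leaf paths.
paths = [(f"p{i}", i) for i in range(16)]

total = sum(1 for _ in iter_combinations(paths))
pairs_at_most = sum(1 for _ in iter_combinations(paths, max_size=2))
print(total)          # 65535
print(pairs_at_most)  # 136 = 16 singles + 120 pairs

first = next(iter_combinations(paths))
print(first)          # {'p0': [0]}
```

A generator keeps only one combination in memory at a time, so the caller can stop early or filter results without building the whole list.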
To spot-check these results, we can look at a few entries individually:
>>> combos[0]
{'firstName': ['John']}
>>> combos[1]
{'lastName': ['Smith']}
>>> combos[2]
{'isAlive': [True]}
>>> combos[3]
{'age': [27]}
>>> combos[4]
{'address.streetAddress': ['21 2nd Street']}
>>> combos[5]
{'address.city': ['New York']}
>>> combos[6]
{'address.state': ['NY']}
>>> combos[7]
{'address.postalCode': ['10021-3100']}
>>> combos[8]
{'address.country': ['USA']}
>>> combos[15]
{'spouse': [None]}
>>> combos[16]
{'firstName_lastName': ['John', 'Smith']}
>>> combos[126]
{'phoneNumbers[1].type_phoneNumbers[1].number': ['office', '646 555-4567']}
>>> combos[-1]
{'firstName_lastName_isAlive_age_address.streetAddress_address.city_address.state_address.postalCode_address.country_phoneNumbers[0].type_phoneNumbers[0].number_phoneNumbers[1].type_phoneNumbers[1].number_phoneNumbers[2].type_phoneNumbers[2].number_spouse': ['John', 'Smith', True, 27, '21 2nd Street', 'New York', 'NY', '10021-3100', 'USA', 'home', '212 555-1234', 'office', '646 555-4567', 'mobile', '123 456-7890', None]}
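These results include single-path entries as well as combinations of several paths, up to the combination of all paths. Note that the path keys are built with "[]" for list indexes and "." for dict keys.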
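Since the path keys use "." for dict keys and "[index]" for list positions, a path string can also be followed back to its value. The sketch below shows one way to do that. The name resolve_path and the sample record are assumptions for illustration, and the sketch assumes that dict keys are strings containing no ".", "[" or "]".

```python
import re

# Matches either a dict key (no ".", "[" or "]") or a list index such as "[2]".
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve_path(data, path):
    """Follow a path such as "phoneNumbers[1].type" back to its value."""
    current = data
    for key, index in _TOKEN.findall(path):
        # Exactly one of the two groups is non-empty for each token.
        current = current[int(index)] if index else current[key]
    return current


# Hypothetical record with the same shape as part of the example above.
record = {
    "address": {"city": "New York"},
    "phoneNumbers": [
        {"type": "home", "number": "212 555-1234"},
        {"type": "office", "number": "646 555-4567"},
    ],
}
print(resolve_path(record, "address.city"))          # New York
print(resolve_path(record, "phoneNumbers[1].type"))  # office
```

If keys may contain dots or brackets, the string form is no longer reversible and a structured path (for example a tuple of keys and indexes) would be the safer representation.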
Unless otherwise noted, articles on this site are original. If you repost this article, please credit the source: 关于python实现json/字典数据中所有key路径拼接组合问题 - Python技术站