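Reading and Writing JSON Data in Python

1. Overview of JSON File Operations

JSON (JavaScript Object Notation) is a lightweight data-interchange format built from objects (key-value pairs) and arrays (ordered lists). It is widely used in Web APIs, configuration files, and data storage. Python's built-in json module provides complete JSON serialization and deserialization.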

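JSON and Python data types map onto each other closely: a Python dict corresponds to a JSON object, a list or tuple to a JSON array, and strings, numbers, booleans, and None each have a JSON counterpart. The mapping is not fully symmetric (a tuple, for example, is read back as a list), so understanding it is the basis for using the json module correctly.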

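2. Core Functions of the json Module

The json module provides four core functions, split into file operations and string operations:

Function	Purpose	Direction
`json.dump()`	Write a Python object to a JSON file	Python → file
`json.load()`	Read a JSON file into a Python object	file → Python
`json.dumps()`	Convert a Python object to a JSON string	Python → string
`json.loads()`	Convert a JSON string to a Python object	string → Python

Memory aid: the s in dumps and loads stands for string, meaning the function works on a string rather than a file.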

3. Serialization: Python Objects to JSON

json.dumps(): convert to a JSON string

Code example

import json

# Convert a Python dict to a JSON string
user = {
    "name": "张三",
    "age": 28,
    "city": "北京",
    "skills": ["Python", "JavaScript", "SQL"]
}

# Basic conversion (non-ASCII characters are escaped by default)
json_str = json.dumps(user)
print(json_str)
# {"name": "\u5f20\u4e09", "age": 28, "city": "\u5317\u4eac", "skills": ["Python", "JavaScript", "SQL"]}

# Pretty-printed output (4-space indent, Chinese characters kept as-is)
json_str_pretty = json.dumps(user, indent=4, ensure_ascii=False)
print(json_str_pretty)
# {
#     "name": "张三",
#     "age": 28,
#     "city": "北京",
#     "skills": [
#         "Python",
#         "JavaScript",
#         "SQL"
#     ]
# }

json.dump(): write to a JSON file

Code example

import json

data = {
    "employees": [
        {"name": "张三", "department": "技术部", "salary": 12000},
        {"name": "李四", "department": "市场部", "salary": 9500},
        {"name": "王五", "department": "人事部", "salary": 8000}
    ]
}

# Write the data to a JSON file
with open("employees.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2, ensure_ascii=False)

print("JSON file written successfully!")

# Read the file back as plain text to verify the result
with open("employees.json", "r", encoding="utf-8") as f:
    content = f.read()
    print(content)

4. Deserialization: JSON to Python Objects

json.loads(): convert a JSON string to a Python object

Code example

import json

json_str = '{"name": "张三", "age": 28, "married": true}'

# Convert the JSON string to a Python dict
user = json.loads(json_str)

print(f"Name: {user['name']}")        # 张三
print(f"Age: {user['age']}")          # 28
print(f"Married: {user['married']}")  # True

# Check the type
print(type(user))  # <class 'dict'>

json.load(): read from a JSON file

Code example

import json

# Assume config.json contains the following:
# {
#   "database": {
#     "host": "localhost",
#     "port": 3306,
#     "name": "mydb"
#   },
#   "debug": true,
#   "max_connections": 10
# }

# Read the JSON configuration file
with open("config.json", "r", encoding="utf-8") as f:
    config = json.load(f)

# Use the configuration data
print(f"Database address: {config['database']['host']}:{config['database']['port']}")
print(f"Database name: {config['database']['name']}")
print(f"Debug mode: {config['debug']}")

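5. Handling Chinese Text and Formatted Output

By default, json.dumps() converts non-ASCII characters such as Chinese into Unicode escape sequences (for example \u5f20\u4e09). The result is valid JSON but hard for people to read; pass ensure_ascii=False to keep the characters as they are. When writing such output to a file, open the file with encoding="utf-8", as the examples in this article do.

Code example

import json

data = {"消息": "你好世界", "状态": "成功"}

# Default behavior (Chinese characters are escaped)
print(json.dumps(data))
# {"\u6d88\u606f": "\u4f60\u597d\u4e16\u754c", "\u72b6\u6001": "\u6210\u529f"}

# Keep the Chinese characters
print(json.dumps(data, ensure_ascii=False))
# {"消息": "你好世界", "状态": "成功"}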

Common formatting parameters

Code example

import json

data = {"name": "张三", "age": 28, "hobbies": ["读书", "游泳"]}

# indent: number of spaces per indentation level; makes the output easier to read
print(json.dumps(data, indent=2, ensure_ascii=False))

# sort_keys: output the keys in sorted order
print(json.dumps(data, sort_keys=True, indent=2, ensure_ascii=False))

# separators: custom item and key separators (compact output without spaces)
print(json.dumps(data, separators=(",", ":"), ensure_ascii=False))

# Combined: write a readable JSON configuration file
with open("config.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=4, sort_keys=True, ensure_ascii=False)

6. JSON and Python Type Mapping

Python type	JSON type	Example
dict	object	{"name": "张三"}
list / tuple	array	[1, 2, 3]
str	string	"hello"
int / float	number	42 / 3.14
True	true	true
False	false	false
None	null	null

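Tip

A Python tuple becomes an array when serialized to JSON and comes back as a list, not a tuple, when deserialized. A custom JSONEncoder does not help here, because its default() method is only called for types the json module cannot serialize itself, and tuples are handled natively; to preserve tuples, record the type in the data before encoding (for example with a marker field) and restore it after decoding. In addition, JSON keys must be strings. Keys of type int, float, bool, or None are converted to strings automatically, while any other key type (such as a tuple) raises TypeError unless skipkeys=True is passed, in which case those keys are skipped.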
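The tuple and key behavior can be checked with a short round trip. This is a minimal sketch; the names `point`, `scores`, and `tuple_key_error` are made up for illustration.

```python
import json

# A tuple is written as a JSON array and comes back as a list.
point = (3, 4)
restored = json.loads(json.dumps(point))
print(restored, type(restored).__name__)

# int keys are converted to strings on the way out.
scores = {1: "one", 2: "two"}
scores_back = json.loads(json.dumps(scores))
print(scores_back)

# A tuple key is not converted; json.dumps raises TypeError.
try:
    json.dumps({(1, 2): "pair"})
    tuple_key_error = None
except TypeError as exc:
    tuple_key_error = exc
print(type(tuple_key_error).__name__)

# skipkeys=True drops unsupported keys instead of raising.
compact = json.dumps({(1, 2): "pair", "ok": 1}, skipkeys=True)
print(compact)
```

Because int keys come back as strings, a dict with integer keys does not survive a round trip unchanged; code that reads the data back has to convert the keys itself.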

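7. Practical Examples

Custom JSON encoder

Code example

import json
from datetime import datetime

class CustomEncoder(json.JSONEncoder):
    """Custom JSON encoder that handles datetime and other special types."""

    def default(self, obj):
        # default() is only called for objects json cannot serialize itself
        if isinstance(obj, datetime):
            return obj.strftime("%Y-%m-%d %H:%M:%S")
        elif isinstance(obj, set):
            # A set has no defined order, so the element order in the output may vary
            return list(obj)
        elif isinstance(obj, bytes):
            # Assumes the bytes hold UTF-8 text; other data raises UnicodeDecodeError
            return obj.decode("utf-8")
        # Let the parent class handle other types (it raises TypeError)
        return super().default(obj)

# Use the custom encoder
data = {
    "timestamp": datetime.now(),
    "tags": {"Python", "JSON", "教程"},
    "active": True
}

json_str = json.dumps(data, cls=CustomEncoder, indent=2, ensure_ascii=False)
print(json_str)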
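An encoder like this only covers the writing direction: when the output is parsed again, the timestamp comes back as a plain string and the tags as a list. One possible way to restore them is to pass an `object_hook` function to `json.loads`. The sketch below assumes the field names `timestamp` and `tags` and the `%Y-%m-%d %H:%M:%S` format used in the example; `restore_types` is a hypothetical helper, not part of the json module.

```python
import json
from datetime import datetime

def restore_types(obj):
    """Hypothetical object_hook: convert known fields back to Python types."""
    if "timestamp" in obj:
        obj["timestamp"] = datetime.strptime(obj["timestamp"], "%Y-%m-%d %H:%M:%S")
    if "tags" in obj:
        obj["tags"] = set(obj["tags"])
    return obj

json_text = '{"timestamp": "2024-01-15 09:30:00", "tags": ["Python", "JSON"], "active": true}'

# object_hook is called with every JSON object (dict) that is decoded
record = json.loads(json_text, object_hook=restore_types)
print(type(record["timestamp"]).__name__)
print(type(record["tags"]).__name__)
```

Matching on field names is the simplest option but ties the decoder to one data layout; it is shown here only to illustrate where the hook fits.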

JSON configuration file manager

Code example

import json
import os

class ConfigManager:
    """Manager for a JSON configuration file."""

    def __init__(self, filepath):
        self.filepath = filepath
        self.config = {}
        self.load()

    def load(self):
        """Load the configuration file; start empty if it does not exist."""
        if os.path.exists(self.filepath):
            with open(self.filepath, "r", encoding="utf-8") as f:
                self.config = json.load(f)
        else:
            self.config = {}

    def save(self):
        """Save the configuration file."""
        with open(self.filepath, "w", encoding="utf-8") as f:
            json.dump(self.config, f, indent=4, ensure_ascii=False)

    def get(self, key, default=None):
        """Get a configuration value."""
        return self.config.get(key, default)

    def set(self, key, value):
        """Set a configuration value and write the file immediately."""
        self.config[key] = value
        self.save()

# Usage example
config = ConfigManager("app_config.json")
config.set("database_host", "localhost")
config.set("database_port", 3306)
print(config.get("database_host"))  # localhost

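Merging JSON data

Code example

import json
import glob

def merge_json_files(pattern, output_file):
    """Merge the list data from multiple JSON files."""
    all_data = []

    # sorted() makes the merge order deterministic; glob.glob() returns files in arbitrary order
    for filename in sorted(glob.glob(pattern)):
        with open(filename, "r", encoding="utf-8") as f:
            data = json.load(f)
            if isinstance(data, list):
                # A top-level list contributes each of its items
                all_data.extend(data)
            elif isinstance(data, dict):
                # A top-level object counts as a single record
                all_data.append(data)

    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(all_data, f, indent=2, ensure_ascii=False)

    print(f"Merge complete: {len(all_data)} records in total")

# Merge all data_*.json files
merge_json_files("data_*.json", "merged.json")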

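8. Notes

Note 1: When the output contains Chinese and is meant to be read by people, use ensure_ascii=False; otherwise Chinese characters are escaped as \uXXXX sequences, which are still valid JSON but hard to read.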

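Note 2: JSON does not support every Python data type. Values such as set, datetime, and instances of custom classes cannot be serialized directly; use a custom JSONEncoder, or pass default=str for a simple one-way conversion to strings.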
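The default=str shortcut can be sketched as follows. The `event` dict is a made-up sample; the point is only that the fallback function is applied to values the json module cannot handle itself.

```python
import json
from datetime import date

event = {"title": "release", "day": date(2024, 5, 1)}

# Without a fallback, json.dumps raises TypeError for the date object.
try:
    json.dumps(event)
    failed = False
except TypeError:
    failed = True
print(failed)  # True

# default=str is called only for objects json cannot serialize itself.
event_json = json.dumps(event, default=str)
print(event_json)  # {"title": "release", "day": "2024-05-01"}
```

The conversion is one-way: loading `event_json` again gives the day as a string, not a date.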

Note 3: When parsing untrusted JSON data, use json.loads() rather than eval(). eval executes arbitrary Python code, which is a serious security risk.
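As a rough illustration, json.loads() treats its input purely as data and rejects anything that is not valid JSON, including text that eval() would happily run. `parse_untrusted` is a hypothetical helper name used only for this sketch.

```python
import json

def parse_untrusted(text):
    """Parse text as JSON; return None when it is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

good = parse_untrusted('{"user": "alice", "admin": false}')
# This string is a Python expression, not JSON, so it is rejected rather than executed.
bad = parse_untrusted('__import__("os").getcwd()')
print(good)  # {'user': 'alice', 'admin': False}
print(bad)   # None
```

Returning None keeps the sketch short, but it cannot be told apart from a valid JSON null; real code may prefer to let the exception propagate.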

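Note 4: json.load() reads the entire file into memory. For very large files, consider streaming the data with the ijson library, or storing it as JSON Lines (one JSON value per line) so that it can be read line by line, to avoid running out of memory.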
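Line-by-line reading works only when every line is a complete JSON value (the JSON Lines layout). A minimal sketch using only the standard library is shown below; `iter_json_lines` and the sample records are assumptions for illustration, and io.StringIO stands in for a large file.

```python
import io
import json

def iter_json_lines(lines):
    """Yield one parsed object per non-empty line (JSON Lines layout)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# io.StringIO stands in for a file opened with open(path, encoding="utf-8").
sample = io.StringIO(
    '{"id": 1, "score": 4.5}\n'
    '{"id": 2, "score": 3.2}\n'
    '\n'
    '{"id": 3, "score": 4.8}\n'
)

# Only one record is held in memory at a time while filtering.
high_ids = [row["id"] for row in iter_json_lines(sample) if row["score"] > 4.0]
print(high_ids)  # [1, 3]
```

A regular pretty-printed JSON file cannot be read this way, because a single value spans many lines; that case needs a streaming parser.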

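9. Summary

Four core functions: dump/load work with files, dumps/loads work with strings; the functions ending in s handle strings.
Chinese text: use ensure_ascii=False to keep Chinese characters as-is; the indent parameter controls pretty-printing.
Type mapping: Python and JSON types have a well-defined correspondence, and understanding it is a prerequisite for correct serialization and deserialization.
Safe parsing: always parse JSON with the json module rather than eval() to avoid code injection.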

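10. Exercises

Exercise 1

Write a program that creates a dict of user information (name, age, email, address), serializes it to a formatted JSON string, prints it, and then writes it to the file users.json.

Exercise 2

Write a function that reads a JSON file containing a list of books, selects the books rated above 4.0, and writes the result to a new JSON file.

Exercise 3

Define a custom JSONEncoder class that can serialize datetime, set, and instances of a custom class. Write test code to verify that the encoder handles these types correctly.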

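FAQ

What is the difference between json.dumps and json.dump?

dumps (with an s) converts a Python object to a JSON string and returns it; dump (without the s) writes the Python object directly to a file object. The s stands for string: dumps works with a string, not a file.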
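The relationship between the two functions can be checked directly. The sketch below uses io.StringIO as a stand-in for a real file; `settings` is a made-up sample object.

```python
import io
import json

settings = {"theme": "dark", "font_size": 14}

# dumps returns the JSON text as a str.
text = json.dumps(settings)

# dump writes the same text to any object with a write() method.
buffer = io.StringIO()
json.dump(settings, buffer)

print(buffer.getvalue() == text)  # True
```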

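Why is True written as true in JSON?

JSON has its own format specification: booleans are the lowercase true/false, and the empty value is null. Python's True/False/None are converted to the corresponding JSON forms automatically during serialization and converted back during deserialization.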
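A quick round trip shows the conversion in both directions. The `flags` dict is a made-up sample.

```python
import json

flags = {"enabled": True, "archived": False, "owner": None}

# Python True/False/None are written as JSON true/false/null.
flags_json = json.dumps(flags)
print(flags_json)  # {"enabled": true, "archived": false, "owner": null}

# Parsing turns them back into the Python values.
flags_back = json.loads(flags_json)
print(flags_back == flags)  # True
```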

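How do I handle very large JSON files?

json.load() reads the entire file into memory. For very large files, use the ijson library for streaming parsing, or store the data as JSON Lines (one JSON value per line) and process it line by line.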

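Can I use eval() instead of json.loads()?

Absolutely not. eval() executes arbitrary Python code, so JSON from an untrusted source can lead to a serious vulnerability (code injection). It also fails on valid JSON, because true, false, and null are not Python names. Always use json.loads() to parse JSON safely.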
