Compare commits

...

34 Commits

Author SHA1 Message Date
z060142
2c8a9e4588
Merge pull request #13 from z060142/Temporary-solution
Temporary solution
2025-05-17 03:52:52 +08:00
z060142
f9457bf992 Replace text deduplication with difflib-based similarity matching to reduce false negatives 2025-05-17 02:16:41 +08:00
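The difflib-based matching this commit describes can be sketched roughly as follows. This is an illustrative assumption, not the repository's actual code: the 0.9 threshold and the function names `is_similar`/`seen_before` are made up for the example.

```python
import difflib

# Hypothetical sketch: fuzzy duplicate detection with difflib.SequenceMatcher.
# ratio() returns a similarity score in [0.0, 1.0]; near-identical strings
# score close to 1.0, so small copy/OCR variations no longer defeat the check.
def is_similar(a: str, b: str, threshold: float = 0.9) -> bool:
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold

def seen_before(text: str, history: list[str], threshold: float = 0.9) -> bool:
    """Return True if `text` is a near-duplicate of any recent message."""
    return any(is_similar(text, old, threshold) for old in history)
```

Unlike exact-match deduplication, this treats "Anyone online right now??" as a duplicate of "Anyone online right now?", which is exactly the false-negative class the commit targets.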
z060142
a8603d4d45 Refine pause/resume behavior handling 2025-05-16 11:47:31 +08:00
z060142
e3e3d3b914 Fix duplicate log print issue 2025-05-16 02:26:28 +08:00
z060142
dad375dec8 Clean up redundant code and adjust initial game window size 2025-05-16 02:13:37 +08:00
z060142
2ac63718a9 Implement perceptual image hash deduplication for bubble processing
- Added `simple_bubble_dedup.py` module using perceptual hashing (pHash) to detect duplicate chat bubbles based on visual similarity.
- System keeps recent N (default 5) hashes and skips bubbles with Hamming distance below threshold (default 5).
- Integrated into `run_ui_monitoring_loop()`:
  - Hash is computed upon bubble snapshot capture.
  - Duplicate check occurs before message enqueue.
  - Sender info is optionally attached to matching hash entries after successful processing.
- Deduplication state is persisted in `simple_bubble_dedup.json`.
- F7 (`clear_history`) and F8 (`reset_state`) now also clear image-based hash history.
- Removed or commented out legacy `recent_texts` text-based deduplication logic.

This visual deduplication system reduces false negatives caused by slight text variations and ensures higher confidence in skipping repeated bubble interactions.
2025-05-16 02:02:31 +08:00
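The dedup flow this commit describes can be sketched as below. This is a simplified illustration, not the module's actual code: it uses a plain average hash over an 8x8 grayscale grid instead of a DCT-based pHash on a real screenshot, and while the class name echoes the commit message, its body and signatures are assumptions.

```python
from collections import deque

def average_hash(pixels):
    """64-bit average hash of an 8x8 grayscale grid (8 rows of 8 ints, 0-255)."""
    flat = [p for row in pixels for p in row]
    avg = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        # Each pixel contributes one bit: 1 if at or above the mean brightness.
        bits = (bits << 1) | (1 if p >= avg else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

class SimpleBubbleDedup:
    def __init__(self, max_hashes=5, threshold=5):
        self.hashes = deque(maxlen=max_hashes)  # keep the N most recent hashes
        self.threshold = threshold

    def is_duplicate(self, pixels):
        h = average_hash(pixels)
        if any(hamming(h, seen) < self.threshold for seen in self.hashes):
            return True  # visually close to a recent bubble: skip it
        self.hashes.append(h)
        return False
```

A bubble whose pixels differ only slightly from a recent one lands within the Hamming-distance threshold and is skipped, which is how visual dedup tolerates the small rendering variations that defeat text comparison.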
z060142
0b794a4c32 Add recovery mechanism for unexpected window closure (finally) 2025-05-15 12:13:51 +08:00
z060142
677a73f026 Force multiple window topmost strategies to prevent focus loss 2025-05-15 11:18:54 +08:00
z060142
890772f70e Add message deduplication system and UI fallback handling for updated game states
- Implemented `MessageDeduplication` class to suppress duplicate bot replies:
  - Normalizes sender and message content for reliable comparison.
  - Tracks processed messages with timestamp-based expiry (default 1 hour).
  - Integrated into `run_ui_monitoring_loop()` with support for F7/F8-based history resets.
  - Periodic cleanup thread purges expired entries every 10 minutes.

- Added new UI fallback handling logic to address post-update game state changes:
  - Detects `chat_option.png` overlay before bubble detection and presses ESC to dismiss.
  - Detects `update_confirm.png` when chat room state is unavailable and clicks it to proceed.
  - Both behaviors improve UI stability following game version changes.

- Updated `essential_templates` dictionary and constants with the two new template paths:
  - `chat_option.png`
  - `update_confirm.png`

These improvements reduce redundant bot responses and enhance UI resilience against inconsistent or obstructed states in the latest game versions.
2025-05-15 02:16:24 +08:00
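The `MessageDeduplication` behavior described above (normalization, timestamp expiry, periodic cleanup) can be sketched minimally like this; the class name comes from the commit, but the method signatures and bodies are illustrative assumptions.

```python
import time

class MessageDeduplication:
    """Sketch: suppress duplicate (sender, message) pairs within an expiry window."""

    def __init__(self, expiry_seconds=3600):  # default 1 hour, per the commit
        self.expiry = expiry_seconds
        self.seen = {}  # normalized (sender, message) -> timestamp first processed

    @staticmethod
    def _normalize(sender, message):
        # Collapse whitespace and case so trivially re-rendered text compares equal.
        return (sender.strip().lower(), " ".join(message.split()).lower())

    def is_duplicate(self, sender, message, now=None):
        now = time.time() if now is None else now
        key = self._normalize(sender, message)
        ts = self.seen.get(key)
        if ts is not None and now - ts < self.expiry:
            return True
        self.seen[key] = now
        return False

    def cleanup(self, now=None):
        """Purge expired entries (the commit runs this every 10 minutes)."""
        now = time.time() if now is None else now
        self.seen = {k: t for k, t in self.seen.items() if now - t < self.expiry}
```

An F7/F8 reset, as described in the commit, would simply clear `self.seen`.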
z060142
2836ce899d
Merge pull request #12 from z060142/Temporary-solution
Temporary solution
2025-05-13 04:53:40 +08:00
z060142
51a99ee5ad Refactor Game Monitor into Game Manager with Setup.py integration and full process control
- Replaced legacy `game_monitor.py` with a new modular `game_manager.py`.
- Introduced `GameMonitor` class to encapsulate:
  - Game window detection, focus enforcement, and resize enforcement.
  - Timed game restarts based on configuration interval.
  - Callback system to notify Setup.py on restart completion.
  - Cross-platform game launching (Windows/Unix).
  - Process termination using `psutil` if available.

- `Setup.py` now acts as the control hub:
  - Instantiates and manages `GameMonitor`.
  - Provides live configuration updates (e.g., window title, restart timing).
  - Coordinates bot lifecycle with game restarts.

- Maintains standalone execution mode for `game_manager.py` (for testing or CLI use).
- Replaces older “always-on-top” logic with foreground window activation.
- Dramatically improves control, flexibility, and automation reliability for game-based workflows.
2025-05-13 03:40:14 +08:00
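The callback coordination between `GameMonitor` and `Setup.py` can be sketched like this. It is a stripped-down illustration of the pattern only: the real class also manages windows and processes, and the interval values and `"restart_complete"` handling shown here follow the commit message but are otherwise assumed.

```python
import threading
import time

class GameMonitor:
    """Minimal sketch: timed restarts on a background thread, reported via callback."""

    def __init__(self, restart_interval_minutes=60, callback=None):
        self.restart_interval = restart_interval_minutes * 60
        self.callback = callback  # e.g. Setup.py's game_monitor_callback
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread:
            self._thread.join(timeout=5)

    def _loop(self):
        next_restart = time.time() + self.restart_interval
        while not self._stop.is_set():
            if time.time() >= next_restart:
                # ... terminate and relaunch the game process here ...
                if self.callback:
                    self.callback("restart_complete")  # Setup.py restarts the bot
                next_restart = time.time() + self.restart_interval
            self._stop.wait(0.01)  # interruptible sleep
```

Inverting control this way lets `Setup.py` own the bot lifecycle while the monitor only reports events, which is the separation the refactor describes.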
z060142
a5b6a44164 Replace always-on-top with foreground activation for game window focus 2025-05-12 23:52:32 +08:00
z060142
59471b62ce Improve game window topmost handling and add forced reconnection for remote control stability 2025-05-12 23:17:07 +08:00
z060142
b33ea85768 Added new styles for speech bubbles for detection 2025-05-10 01:09:51 +08:00
z060142
4a03ca4424 Add character limit to user profile entries 2025-05-09 20:14:44 +08:00
z060142
7d9ead1c60 update memory backup scripts 2025-05-09 13:13:58 +08:00
z060142
bccc6d413f Migrate ChromaDB embedding model to paraphrase-multilingual-mpnet-base-v2 2025-05-09 12:32:06 +08:00
z060142
65df12a20e Fix encoding issues, enhance ChromaDB reader with ID query and embedding model selection 2025-05-09 11:29:56 +08:00
z060142
2a68f04e87 Add something 2025-05-08 03:54:34 +08:00
z060142
4dd5d91029 Fix something 2025-05-08 03:24:44 +08:00
z060142
48c0c25a42 Extend ChromaDB memory system with scheduled tasks and Setup UI support
- Added new scripts to manage ChromaDB memory processing and periodic scheduling (e.g. compaction, deduplication, reindexing).
- Optimized chatbot memory usage by improving base memory retrieval logic and preload strategy.
- Updated Setup.py UI to include scheduling options for memory maintenance tasks.
- Ensures better long-term memory performance, avoids memory bloat, and enables proactive management of large-scale memory datasets.
2025-05-08 03:08:51 +08:00
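One common way to implement the periodic scheduling this commit mentions is a daemon thread driven by an `Event` timeout. This is a generic sketch under that assumption, not the repository's actual scheduler; the maintenance task is a placeholder.

```python
import threading

def schedule_periodic(task, interval_seconds, stop_event):
    """Run `task` every `interval_seconds` until `stop_event` is set.

    Event.wait(timeout) doubles as an interruptible sleep: it returns True
    as soon as the event is set, so shutdown is prompt.
    """
    def runner():
        while not stop_event.wait(interval_seconds):
            task()  # e.g. a hypothetical compaction/dedup/reindex pass
    t = threading.Thread(target=runner, daemon=True)
    t.start()
    return t
```

A Setup UI would typically expose `interval_seconds` as a setting and call `stop_event.set()` when the session stops.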
z060142
ce111cf3d5 Enhanced server connection stability 2025-05-07 23:07:54 +08:00
z060142
a29d336df0 Add remote control system for evaluation 2025-05-07 04:16:23 +08:00
z060142
6cffa4c70c
Merge pull request #11 from z060142/Refactoring
Refactoring
2025-05-07 01:11:58 +08:00
z060142
90b3a492d7
Merge pull request #10 from z060142/Refactoring
Refactor keyword detection with dual-template matching and coordinate correction
2025-05-01 22:16:14 +08:00
z060142
42a6bde23f
Merge pull request #9 from z060142/Refactoring
Refactoring
2025-05-01 04:45:00 +08:00
z060142
7e4383fa98
Merge pull request #8 from z060142/Refactoring
Major progress! Now LLM can call Tool use in the dialogue more smoothly
2025-04-28 00:17:56 +08:00
z060142
74270aace7
Merge pull request #7 from z060142/Refactoring
Enhance llm_debug_script.py to allow dynamic username input
2025-04-27 23:36:18 +08:00
z060142
30e418eba4
Merge pull request #6 from z060142/Refactoring
Improve the setup process
2025-04-27 23:17:13 +08:00
z060142
5cba0b970c
Merge pull request #5 from z060142/Refactoring
Add Bot setup UI
2025-04-26 14:03:05 +08:00
z060142
583600760b
Merge pull request #4 from z060142/Refactoring
Refactoring
2025-04-26 10:44:47 +08:00
z060142
a9ff1959ef
Merge pull request #3 from z060142/Refactoring
Update
2025-04-21 19:02:51 +08:00
z060142
381b40c62f
Merge pull request #2 from z060142/Refactoring
Enhance LLM Performance and Multi-Person Chat Stability
2025-04-20 21:51:46 +08:00
z060142
30df8f8320
Merge pull request #1 from z060142/Refactoring
Refactoring and modularize existing work logic
2025-04-18 13:33:52 +08:00
20 changed files with 5213 additions and 705 deletions

5
.gitignore vendored

@ -3,8 +3,11 @@
llm_debug.log
config.py
config.py.bak
simple_bubble_dedup.json
__pycache__/
debug_screenshots/
chat_logs/
backup/
chroma_data/
wolf_control.py
remote_config.json

ClaudeCode.md

@ -15,72 +15,66 @@ Wolf Chat 是一個基於 MCP (Modular Capability Provider) 框架的聊天機
### 核心元件

1. **主控模塊 (main.py)**
   - 協調各模塊的工作
   - 初始化 MCP 連接
   - **容錯處理**:即使 `config.py` 中未配置 MCP 伺服器或所有伺服器連接失敗,程式現在也會繼續執行,僅打印警告訊息,MCP 功能將不可用。 (Added 2025-04-21)
   - **伺服器子進程管理 (修正 2025-05-02)**:使用 `mcp.client.stdio.stdio_client` 啟動和連接 `config.py` 中定義的每個 MCP 伺服器。`stdio_client` 作為一個異步上下文管理器,負責管理其啟動的子進程的生命週期。
   - **Windows 特定處理 (修正 2025-05-02)**:在 Windows 上,如果 `pywin32` 可用,會註冊一個控制台事件處理程序 (`win32api.SetConsoleCtrlHandler`)。此處理程序主要用於輔助觸發正常的關閉流程(最終會調用 `AsyncExitStack.aclose()`),而不是直接終止進程。伺服器子進程的實際終止依賴於 `stdio_client` 上下文管理器在 `AsyncExitStack.aclose()` 期間的清理操作。
   - **記憶體系統初始化 (新增 2025-05-02)**:在啟動時調用 `chroma_client.initialize_memory_system()`,根據 `config.py` 中的 `ENABLE_PRELOAD_PROFILES` 設定決定是否啟用記憶體預載入。
   - 設置並管理主要事件循環
   - **記憶體預載入 (新增 2025-05-02)**:在主事件循環中,如果預載入已啟用,則在每次收到 UI 觸發後、調用 LLM 之前,嘗試從 ChromaDB 預先獲取用戶資料 (`get_entity_profile`)、相關記憶 (`get_related_memories`) 和潛在相關的機器人知識 (`get_bot_knowledge`)。
   - 處理程式生命週期管理和資源清理(通過 `AsyncExitStack` 間接管理 MCP 伺服器子進程的終止)
2. **LLM 交互模塊 (llm_interaction.py)**
   - 與語言模型 API 通信
   - 管理系統提示與角色設定
   - **條件式提示 (新增 2025-05-02)**:`get_system_prompt` 函數現在接受預載入的用戶資料、相關記憶和機器人知識。根據是否有預載入數據,動態調整系統提示中的記憶體檢索協議說明。
   - 處理語言模型的工具調用功能
   - 格式化 LLM 回應
   - 提供工具結果合成機制
3. **UI 互動模塊 (ui_interaction.py)**
   - 使用圖像辨識技術監控遊戲聊天視窗
   - 檢測聊天泡泡與關鍵字
   - 複製聊天內容和獲取發送者姓名
   - 將生成的回應輸入到遊戲中
4. **MCP 客戶端模塊 (mcp_client.py)**
   - 管理與 MCP 服務器的通信
   - 列出和調用可用工具
   - 處理工具調用的結果和錯誤
5. **配置模塊 (config.py)**
   - 集中管理系統參數和設定
   - 整合環境變數
   - 配置 API 密鑰和服務器設定
6. **角色定義 (persona.json)**
   - 詳細定義機器人的人格特徵
   - 包含外觀、說話風格、個性特點等資訊
   - 提供給 LLM 以確保角色扮演一致性
7. **遊戲管理器模組 (game_manager.py)** (取代舊的 `game_monitor.py`)
   - **核心類 `GameMonitor`**:封裝所有遊戲視窗監控、自動重啟和進程管理功能。
   - **由 `Setup.py` 管理**:
     - 在 `Setup.py` 的 "Start Managed Bot & Game" 流程中被實例化和啟動。
     - 在停止會話時由 `Setup.py` 停止。
     - 設定(如視窗標題、路徑、重啟間隔等)通過 `Setup.py` 傳遞,並可在運行時通過 `update_config` 方法更新。
   - **功能**:
     - 持續監控遊戲視窗 (`config.WINDOW_TITLE`)。
     - 確保視窗維持在設定檔中指定的位置和大小。
     - 確保視窗保持活躍(帶到前景並獲得焦點)。
     - **定時遊戲重啟**:根據設定檔中的間隔執行。
     - **回調機制**:重啟完成後,通過回調函數通知 `Setup.py`(例如,`restart_complete`),`Setup.py` 隨後處理機器人重啟。
     - **進程管理**:使用 `psutil`(如果可用)查找和終止遊戲進程。
     - **跨平台啟動**:使用 `os.startfile` (Windows) 或 `subprocess.Popen` (其他平台) 啟動遊戲。
   - **獨立運行模式**:`game_manager.py` 仍然可以作為獨立腳本運行 (類似舊的 `game_monitor.py`),此時它會從 `config.py` 加載設定,並通過 `stdout` 發送 JSON 訊號。
8. **ChromaDB 客戶端模塊 (chroma_client.py)** (新增 2025-05-02)
   - 處理與本地 ChromaDB 向量數據庫的連接和互動。
   - 提供函數以初始化客戶端、獲取/創建集合,以及查詢用戶資料、相關記憶和機器人知識。
   - 使用 `chromadb.PersistentClient` 連接持久化數據庫。

### 資料流程
@ -130,7 +124,14 @@ Wolf Chat 是一個基於 MCP (Modular Capability Provider) 框架的聊天機
* **計算頭像座標**:根據**新**找到的氣泡左上角座標,應用特定偏移量 (`AVATAR_OFFSET_X_REPLY`, `AVATAR_OFFSET_Y_REPLY`) 計算頭像點擊位置。
* **互動(含重試)**:點擊計算出的頭像位置,檢查是否成功進入個人資料頁面 (`Profile_page.png`)。若失敗,最多重試 3 次(每次重試前會再次重新定位氣泡)。若成功,則繼續導航菜單複製用戶名稱。
* **原始偏移量**:原始的 `-55` 像素水平偏移量 (`AVATAR_OFFSET_X`) 仍保留,用於 `remove_user_position` 等其他功能。
5. **防重複處理 (Duplicate Prevention)**:
   * **基於圖像哈希的去重 (Image Hash Deduplication)**:新增 `simple_bubble_dedup.py` 模塊,實現基於圖像感知哈希 (Perceptual Hash) 的去重系統。
   * **原理**:系統會計算最近處理過的氣泡圖像的感知哈希值,並保存最近的 N 個 (預設 5 個) 氣泡的哈希。當偵測到新氣泡時,會計算其哈希並與保存的哈希進行比對。如果哈希差異小於設定的閾值 (預設 5),則認為是重複氣泡並跳過處理。
   * **實現**:在 `ui_interaction.py` 的 `run_ui_monitoring_loop` 函數中初始化 `SimpleBubbleDeduplication` 實例,並在偵測到關鍵字並截取氣泡快照後,調用 `is_duplicate` 方法進行檢查。
   * **狀態管理**:使用 `simple_bubble_dedup.json` 文件持久化保存最近的氣泡哈希記錄。
   * **清理**:F7 (`clear_history`) 和 F8 (`reset_state`) 功能已擴展,會同時清除圖像去重系統中的記錄。
   * **發送者信息更新**:在成功處理並將氣泡信息放入隊列後,會嘗試更新去重記錄中對應氣泡的發送者名稱。
   * **文字內容歷史 (已棄用)**:原有的基於 `recent_texts` 的文字內容重複檢查邏輯已**移除或註解**,圖像哈希去重成為主要的去重機制。

#### LLM 整合
@ -598,6 +599,22 @@ Wolf Chat 是一個基於 MCP (Modular Capability Provider) 框架的聊天機
- **依賴項**:Windows 上的控制台事件處理仍然依賴 `pywin32` 套件。如果未安裝,程式會打印警告,關閉時的可靠性可能略有降低(但 `stdio_client` 的正常清理機制應在多數情況下仍然有效)。
- **效果**:恢復了與 `mcp` 庫的兼容性,同時通過標準的上下文管理和輔助性的 Windows 事件處理,實現了在主程式退出時關閉 MCP 伺服器子進程的目標。
## 最近改進2025-05-12
### 遊戲視窗置頂邏輯修改
- **目的**:將 `game_monitor.py` 中強制遊戲視窗「永遠在最上層」(Always on Top) 的行為,修改為「臨時置頂並獲得焦點」(Bring to Foreground/Activate),以解決原方法僅覆蓋其他視窗的問題。
- **`game_monitor.py`**
- 在 `monitor_game_window` 函數的監控循環中,移除了使用 `win32gui.SetWindowPos``win32con.HWND_TOPMOST` 來檢查和設定 `WS_EX_TOPMOST` 樣式的程式碼。
- 替換為檢查當前前景視窗 (`win32gui.GetForegroundWindow()`) 是否為目標遊戲視窗 (`hwnd`)。
- 如果不是,則嘗試以下步驟將視窗帶到前景並獲得焦點:
1. 使用 `win32gui.SetWindowPos` 搭配 `win32con.HWND_TOP` 旗標,將視窗提升到所有非最上層視窗之上。
2. 呼叫 `win32gui.SetForegroundWindow(hwnd)` 嘗試將視窗設為前景並獲得焦點。
3. 短暫延遲後,檢查視窗是否成功成為前景視窗。
4. 如果 `SetForegroundWindow` 未成功,則嘗試使用 `pygetwindow` 庫提供的 `window.activate()` 方法作為備用方案。
- 更新了相關的日誌訊息以反映新的行為和備用邏輯。
- **效果**:監控腳本現在會使用更全面的方法嘗試將失去焦點的遊戲視窗重新激活並帶到前景,包括備用方案,以提高在不同 Windows 環境下獲取焦點的成功率。這取代了之前僅強制視覺覆蓋的行為。
## 開發建議

### 優化方向
@ -622,6 +639,43 @@ Wolf Chat 是一個基於 MCP (Modular Capability Provider) 框架的聊天機
- 添加主題識別與記憶功能
- 探索多輪對話中的上下文理解能力
## 最近改進2025-05-13
### 遊戲監控模組重構
- **目的**:將遊戲監控功能從獨立的 `game_monitor.py` 腳本重構為一個更健壯、更易於管理的 `game_manager.py` 模組,並由 `Setup.py` 統一控制其生命週期和配置。
- **`game_manager.py` (新模組)**
- 創建了 `GameMonitor` 類,封裝了所有遊戲視窗監控、自動重啟和進程管理邏輯。
- 提供了 `create_game_monitor` 工廠函數。
- 支持通過構造函數和 `update_config` 方法進行配置。
- 使用回調函數 (`callback`) 與調用者(即 `Setup.py`)通信,例如在遊戲重啟完成時。
- 保留了獨立運行模式,以便在直接執行時仍能工作(主要用於測試或舊版兼容)。
- 程式碼註解和日誌訊息已更新為英文。
- **新增遊戲崩潰自動恢復 (2025-05-15)**
- 在 `_monitor_loop` 方法中,優先檢查遊戲進程 (`_is_game_running`) 是否仍在運行。
- 如果進程消失,會記錄警告並嘗試重新啟動遊戲 (`_start_game_process`)。
- 新增 `_is_game_running` 方法,使用 `psutil` 檢查具有指定進程名稱的遊戲是否正在運行。
- **`Setup.py` (修改)**
- 導入 `game_manager`
- 在 `WolfChatSetup` 類的 `__init__` 方法中初始化 `self.game_monitor = None`
- 在 `start_managed_session` 方法中:
- 創建 `game_monitor_callback` 函數以處理來自 `GameMonitor` 的動作(特別是 `restart_complete`)。
- 使用 `game_manager.create_game_monitor` 創建 `GameMonitor` 實例。
- 啟動 `GameMonitor`
- 新增 `_handle_game_restart_complete` 方法,用於在收到 `GameMonitor` 的重啟完成回調後,處理機器人的重啟。
- 在 `stop_managed_session` 方法中,調用 `self.game_monitor.stop()` 並釋放實例。
- 修改 `_restart_game_managed` 方法,使其在 `self.game_monitor` 存在且運行時,調用 `self.game_monitor.restart_now()` 來執行遊戲重啟。
- 在 `save_settings` 方法中,如果 `self.game_monitor` 實例存在,則調用其 `update_config` 方法以更新運行時配置。
- **`main.py` (修改)**
- 移除了所有對舊 `game_monitor.py` 的導入、子進程啟動、訊號讀取和生命週期管理相關的程式碼。遊戲監控現在完全由 `Setup.py` 在受管會話模式下處理。
- **舊檔案刪除**
- 刪除了原來的 `game_monitor.py` 文件。
- **效果**
- 遊戲監控邏輯更加內聚和模塊化。
- `Setup.py` 現在完全控制遊戲監控的啟動、停止和配置,簡化了 `main.py` 的職責。
- 通過回調機制實現了更清晰的模塊間通信。
- 提高了程式碼的可維護性和可擴展性。
### 注意事項

1. **圖像模板**:確保所有必要的 UI 元素模板都已截圖並放置在 templates 目錄
@ -725,3 +779,42 @@ ClaudeCode.md

1413
Setup.py

File diff suppressed because it is too large.

208
batch_memory_record.py Normal file

@ -0,0 +1,208 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Wolf Chat 批次記憶備份工具
自動掃描chat_logs資料夾針對所有日誌檔案執行記憶備份
"""
import os
import re
import sys
import time
import argparse
import subprocess
import logging
from datetime import datetime
from typing import List, Optional, Tuple
# 設置日誌
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
handlers=[
logging.FileHandler("batch_backup.log"),
logging.StreamHandler()
]
)
logger = logging.getLogger("BatchMemoryBackup")
def find_log_files(log_dir: str = "chat_logs") -> List[Tuple[str, str]]:
"""
掃描指定目錄找出所有符合YYYY-MM-DD.log格式的日誌文件
返回: [(日期字符串, 文件路徑), ...]按日期排序
"""
date_pattern = re.compile(r'^(\d{4}-\d{2}-\d{2})\.log$')
log_files = []
# 確保目錄存在
if not os.path.exists(log_dir) or not os.path.isdir(log_dir):
logger.error(f"目錄不存在或不是有效目錄: {log_dir}")
return []
# 掃描目錄
for filename in os.listdir(log_dir):
match = date_pattern.match(filename)
if match:
date_str = match.group(1)
file_path = os.path.join(log_dir, filename)
try:
# 驗證日期格式
datetime.strptime(date_str, "%Y-%m-%d")
log_files.append((date_str, file_path))
except ValueError:
logger.warning(f"發現無效的日期格式: {filename}")
# 按日期排序
log_files.sort(key=lambda x: x[0])
return log_files
def process_log_file(date_str: str, backup_script: str = "memory_backup.py") -> bool:
"""
為指定日期的日誌文件執行記憶備份
Parameters:
date_str: 日期字符串格式為YYYY-MM-DD
backup_script: 備份腳本路徑
Returns:
bool: 操作是否成功
"""
logger.info(f"開始處理日期 {date_str} 的日誌")
try:
# 構建命令
cmd = [sys.executable, backup_script, "--backup", "--date", date_str]
# 執行命令
logger.info(f"執行命令: {' '.join(cmd)}")
process = subprocess.run(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
check=False # 不要在命令失敗時拋出異常
)
# 檢查結果
if process.returncode == 0:
logger.info(f"日期 {date_str} 的處理完成")
return True
else:
logger.error(f"處理日期 {date_str} 失敗: {process.stderr}")
return False
except Exception as e:
logger.error(f"處理日期 {date_str} 時發生異常: {str(e)}")
return False
def batch_process(log_dir: str = "chat_logs", backup_script: str = "memory_backup.py",
date_range: Optional[Tuple[str, str]] = None,
wait_seconds: int = 5) -> Tuple[int, int]:
"""
批次處理多個日誌文件
Parameters:
log_dir: 日誌目錄路徑
backup_script: 備份腳本路徑
date_range: (開始日期, 結束日期)用於限制處理範圍格式為YYYY-MM-DD
wait_seconds: 每個文件處理後的等待時間
Returns:
(成功數量, 總數量)
"""
log_files = find_log_files(log_dir)
if not log_files:
logger.warning(f"{log_dir} 中未找到有效的日誌文件")
return (0, 0)
logger.info(f"找到 {len(log_files)} 個日誌文件")
# 如果指定了日期範圍,過濾文件
if date_range:
start_date, end_date = date_range
filtered_files = [(date_str, path) for date_str, path in log_files
if start_date <= date_str <= end_date]
logger.info(f"根據日期範圍 {start_date}{end_date} 過濾後剩餘 {len(filtered_files)} 個文件")
log_files = filtered_files
success_count = 0
total_count = len(log_files)
for i, (date_str, file_path) in enumerate(log_files):
logger.info(f"處理進度: {i+1}/{total_count} - 日期: {date_str}")
if process_log_file(date_str, backup_script):
success_count += 1
# 若不是最後一個文件,等待一段時間再處理下一個
if i < total_count - 1:
logger.info(f"等待 {wait_seconds} 秒後處理下一個文件...")
time.sleep(wait_seconds)
return (success_count, total_count)
def parse_date_arg(date_arg: str) -> Optional[str]:
"""解析日期參數確保格式為YYYY-MM-DD"""
if not date_arg:
return None
try:
parsed_date = datetime.strptime(date_arg, "%Y-%m-%d")
return parsed_date.strftime("%Y-%m-%d")
except ValueError:
logger.error(f"無效的日期格式: {date_arg}請使用YYYY-MM-DD格式")
return None
def main():
parser = argparse.ArgumentParser(description='Wolf Chat 批次記憶備份工具')
parser.add_argument('--log-dir', default='chat_logs', help='日誌檔案目錄,預設為 chat_logs')
parser.add_argument('--script', default='memory_backup.py', help='記憶備份腳本路徑,預設為 memory_backup.py')
parser.add_argument('--start-date', help='開始日期(含),格式為 YYYY-MM-DD')
parser.add_argument('--end-date', help='結束日期(含),格式為 YYYY-MM-DD')
parser.add_argument('--wait', type=int, default=5, help='每個檔案處理間隔時間(秒),預設為 5 秒')
args = parser.parse_args()
# 驗證日期參數
start_date = parse_date_arg(args.start_date)
end_date = parse_date_arg(args.end_date)
# 如果只有一個日期參數,將兩個都設為該日期(僅處理該日期)
if start_date and not end_date:
end_date = start_date
elif end_date and not start_date:
start_date = end_date
date_range = (start_date, end_date) if start_date and end_date else None
logger.info("開始批次記憶備份流程")
logger.info(f"日誌目錄: {args.log_dir}")
logger.info(f"備份腳本: {args.script}")
if date_range:
logger.info(f"日期範圍: {date_range[0]}{date_range[1]}")
else:
logger.info("處理所有找到的日誌檔案")
logger.info(f"等待間隔: {args.wait}")
start_time = time.time()
success, total = batch_process(
log_dir=args.log_dir,
backup_script=args.script,
date_range=date_range,
wait_seconds=args.wait
)
end_time = time.time()
duration = end_time - start_time
logger.info(f"批次處理完成。成功: {success}/{total},耗時: {duration:.2f}")
if success < total:
logger.warning("部分日誌檔案處理失敗,請查看日誌瞭解詳情")
return 1
return 0
if __name__ == "__main__":
sys.exit(main())


@ -47,6 +47,14 @@
"hsv_upper": [107, 255, 255],
"min_area": 2500,
"max_area": 300000
},
{
"name": "easter",
"is_bot": false,
"hsv_lower": [5, 154, 183],
"hsv_upper": [29, 255, 255],
"min_area": 2500,
"max_area": 300000
}
]
}

chroma_client.py

@ -1,6 +1,7 @@
# chroma_client.py
import chromadb
from chromadb.config import Settings
from chromadb.utils import embedding_functions  # New import
import os
import json
import config
@ -10,6 +11,33 @@ import time
_client = None
_collections = {}
# Global embedding function variable
_embedding_function = None
def get_embedding_function():
    """Gets or creates the embedding function based on config"""
    global _embedding_function
    if _embedding_function is None:
        # Default to paraphrase-multilingual-mpnet-base-v2 if not specified or on error
        model_name = getattr(config, 'EMBEDDING_MODEL_NAME', "sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
        try:
            _embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name=model_name)
            print(f"Successfully initialized embedding function with model: {model_name}")
        except Exception as e:
            print(f"Failed to initialize embedding function with model '{model_name}': {e}")
            # Fall back to the default model if the configured one fails and differs from it
            if model_name != "sentence-transformers/paraphrase-multilingual-mpnet-base-v2":
                print("Falling back to default embedding model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
                try:
                    _embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
                    print("Successfully initialized embedding function with default model.")
                except Exception as e_default:
                    print(f"Failed to initialize default embedding function: {e_default}")
                    _embedding_function = None  # Ensure it's None if all attempts fail
            else:
                _embedding_function = None  # Ensure it's None if the default model also fails
    return _embedding_function
def initialize_chroma_client():
    """Initializes and connects to ChromaDB"""
    global _client
@ -34,13 +62,31 @@ def get_collection(collection_name):
    if collection_name not in _collections:
        try:
            emb_func = get_embedding_function()
            if emb_func is None:
                print(f"Failed to get or create collection '{collection_name}' due to embedding function initialization failure.")
                return None
            _collections[collection_name] = _client.get_or_create_collection(
                name=collection_name,
                embedding_function=emb_func
            )
            print(f"Successfully got or created collection '{collection_name}' using configured embedding function.")
        except Exception as e:
            print(f"Failed to get collection '{collection_name}' with configured embedding function: {e}")
            # Attempt to create the collection with the default embedding function as a fallback
            print(f"Attempting to create collection '{collection_name}' with default embedding function...")
            try:
                # Ensure we try the absolute default if the configured one (even if it was the default) failed
                default_emb_func = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
                _collections[collection_name] = _client.get_or_create_collection(
                    name=collection_name,
                    embedding_function=default_emb_func
                )
                print(f"Successfully got or created collection '{collection_name}' with default embedding function after initial failure.")
            except Exception as e_default:
                print(f"Failed to get collection '{collection_name}' even with default embedding function: {e_default}")
                return None
    return _collections[collection_name]

664
game_manager.py Normal file
View File

@ -0,0 +1,664 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Game Manager Module
Provides game window monitoring, automatic restart, and process management features.
Designed to be imported and controlled by setup.py or other management scripts.
"""
import os
import sys
import time
import json
import threading
import subprocess
import logging
import pygetwindow as gw
# Attempt to import platform-specific modules that might be needed
try:
import win32gui
import win32con
HAS_WIN32 = True
except ImportError:
HAS_WIN32 = False
print("Warning: win32gui/win32con modules not installed, some window management features may be unavailable")
try:
import psutil
HAS_PSUTIL = True
except ImportError:
HAS_PSUTIL = False
print("Warning: psutil module not installed, process management features may be unavailable")
class GameMonitor:
"""
Game window monitoring class.
Responsible for monitoring game window position, scheduled restarts, and providing window management functions.
"""
def __init__(self, config_data, remote_data=None, logger=None, callback=None):
# Use the provided logger or create a new one
self.logger = logger or logging.getLogger("GameMonitor")
if not self.logger.handlers:
handler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
self.logger.addHandler(handler)
self.logger.setLevel(logging.INFO)
self.config_data = config_data
self.remote_data = remote_data or {}
self.callback = callback # Callback function to notify the caller
# Read settings from configuration
self.window_title = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("WINDOW_TITLE", "Last War-Survival Game")
self.enable_restart = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("ENABLE_SCHEDULED_RESTART", True)
self.restart_interval = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("RESTART_INTERVAL_MINUTES", 60)
self.game_path = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_EXECUTABLE_PATH", "")
self.window_x = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_X", 50)
self.window_y = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_Y", 30)
self.window_width = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_WIDTH", 600)
self.window_height = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_HEIGHT", 1070)
self.monitor_interval = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("MONITOR_INTERVAL_SECONDS", 5)
# Read game process name from remote_data, use default if not found
self.game_process_name = self.remote_data.get("GAME_PROCESS_NAME", "LastWar.exe")
# Internal state
self.running = False
self.next_restart_time = None
self.monitor_thread = None
self.stop_event = threading.Event()
# Tracking variables for window-focus failure handling
self.last_focus_failure_count = 0
self.last_successful_foreground = time.time()
self.logger.info(f"GameMonitor initialized. Game window: '{self.window_title}', Process: '{self.game_process_name}'")
self.logger.info(f"Position: ({self.window_x}, {self.window_y}), Size: {self.window_width}x{self.window_height}")
self.logger.info(f"Scheduled Restart: {'Enabled' if self.enable_restart else 'Disabled'}, Interval: {self.restart_interval} minutes")
def start(self):
"""Start game window monitoring"""
if self.running:
self.logger.info("Game window monitoring is already running")
return True # Return True if already running
self.logger.info("Starting game window monitoring...")
self.stop_event.clear()
# Set next restart time
if self.enable_restart and self.restart_interval > 0:
self.next_restart_time = time.time() + (self.restart_interval * 60)
self.logger.info(f"Scheduled restart enabled. First restart in {self.restart_interval} minutes")
else:
self.next_restart_time = None
self.logger.info("Scheduled restart is disabled")
# Start monitoring thread
self.monitor_thread = threading.Thread(target=self._monitor_loop, daemon=True)
self.monitor_thread.start()
self.running = True
self.logger.info("Game window monitoring started")
return True
def stop(self):
"""Stop game window monitoring"""
if not self.running:
self.logger.info("Game window monitoring is not running")
return True # Return True if already stopped
self.logger.info("Stopping game window monitoring...")
self.stop_event.set()
# Wait for monitoring thread to finish
if self.monitor_thread and self.monitor_thread.is_alive():
self.logger.info("Waiting for monitoring thread to finish...")
self.monitor_thread.join(timeout=5)
if self.monitor_thread.is_alive():
self.logger.warning("Game window monitoring thread did not stop within the timeout period")
self.running = False
self.monitor_thread = None
self.logger.info("Game window monitoring stopped")
return True
def _monitor_loop(self):
"""Main monitoring loop"""
self.logger.info("Game window monitoring loop started")
last_adjustment_message = "" # Avoid logging repetitive adjustment messages
while not self.stop_event.is_set():
try:
# Restart the game if its process has disappeared
if not self._is_game_running():
self.logger.warning("Game process disappeared - restarting")
time.sleep(2) # Let resources release
if self._start_game_process():
self.logger.info("Game restarted successfully")
else:
self.logger.error("Game restart failed")
time.sleep(self.monitor_interval) # Wait before next check after a restart attempt
continue
# Check for scheduled restart
if self.next_restart_time and time.time() >= self.next_restart_time:
self.logger.info("Scheduled restart time reached. Performing restart...")
self._perform_restart()
# Reset next restart time
self.next_restart_time = time.time() + (self.restart_interval * 60)
self.logger.info(f"Restart timer reset. Next restart in {self.restart_interval} minutes")
# Continue to next loop iteration
time.sleep(self.monitor_interval)
continue
# Find game window
window = self._find_game_window()
adjustment_made = False
current_message = ""
if window:
try:
# Use win32gui functions only on Windows
if HAS_WIN32:
# Get window handle
hwnd = window._hWnd
# 1. Check and adjust position/size
current_pos = (window.left, window.top)
current_size = (window.width, window.height)
target_pos = (self.window_x, self.window_y)
target_size = (self.window_width, self.window_height)
if current_pos != target_pos or current_size != target_size:
window.moveTo(target_pos[0], target_pos[1])
window.resizeTo(target_size[0], target_size[1])
time.sleep(0.1)
window.activate()
time.sleep(0.1)
# Check if changes were successful
new_pos = (window.left, window.top)
new_size = (window.width, window.height)
if new_pos == target_pos and new_size == target_size:
current_message += "Adjusted window position/size. "
adjustment_made = True
# 2. Check and bring to foreground using enhanced method
current_foreground_hwnd = win32gui.GetForegroundWindow()
if current_foreground_hwnd != hwnd:
# Use enhanced forceful focus method
success, method_used = self._force_window_foreground(hwnd, window)
if success:
current_message += f"Focused window using {method_used}. "
adjustment_made = True
# Reset the consecutive-failure counter on success
self.last_focus_failure_count = 0
else:
# Increment the consecutive-failure counter
self.last_focus_failure_count += 1
# Log warning with consecutive failure count
self.logger.warning(f"Window focus failed (attempt {self.last_focus_failure_count}): {method_used}")
# Restart game after too many failures
if self.last_focus_failure_count >= 15:
self.logger.warning("Excessive focus failures, restarting game...")
self._perform_restart()
self.last_focus_failure_count = 0
else:
# Use basic functions on non-Windows platforms
current_pos = (window.left, window.top)
current_size = (window.width, window.height)
target_pos = (self.window_x, self.window_y)
target_size = (self.window_width, self.window_height)
if current_pos != target_pos or current_size != target_size:
window.moveTo(target_pos[0], target_pos[1])
window.resizeTo(target_size[0], target_size[1])
current_message += f"Adjusted game window to position {target_pos} size {target_size[0]}x{target_size[1]}. "
adjustment_made = True
# Try activating the window (may have limited effect on non-Windows)
try:
window.activate()
current_message += "Attempted to activate game window. "
adjustment_made = True
except Exception as activate_err:
self.logger.warning(f"Error activating window: {activate_err}")
except Exception as e:
self.logger.error(f"Unexpected error while monitoring game window: {e}")
# Log only if adjustments were made and the message changed
if adjustment_made and current_message and current_message != last_adjustment_message:
self.logger.info(f"[GameMonitor] {current_message.strip()}")
last_adjustment_message = current_message
elif not window:
# Reset last message if window disappears
last_adjustment_message = ""
except Exception as e:
self.logger.error(f"Error in monitoring loop: {e}")
# Wait for the next check
time.sleep(self.monitor_interval)
self.logger.info("Game window monitoring loop finished")
def _is_game_running(self):
"""Check if game is running"""
if not HAS_PSUTIL:
self.logger.warning("_is_game_running: psutil not available, cannot check process status.")
return True # Assume running if psutil is not available to avoid unintended restarts
try:
return any(p.name().lower() == self.game_process_name.lower() for p in psutil.process_iter(['name']))
except Exception as e:
self.logger.error(f"Error checking game process: {e}")
return False # Assume not running on error
def _find_game_window(self):
"""Find the game window with the specified title"""
try:
windows = gw.getWindowsWithTitle(self.window_title)
if windows:
return windows[0]
except Exception as e:
self.logger.debug(f"Error finding game window: {e}")
return None
def _force_window_foreground(self, hwnd, window):
"""Aggressive window focus implementation"""
if not HAS_WIN32:
return False, "win32 modules unavailable"
success = False
methods_tried = []
# Method 1: HWND_TOPMOST strategy
methods_tried.append("HWND_TOPMOST")
try:
win32gui.SetWindowPos(hwnd, win32con.HWND_TOPMOST, 0, 0, 0, 0,
win32con.SWP_NOMOVE | win32con.SWP_NOSIZE)
time.sleep(0.1)
win32gui.SetWindowPos(hwnd, win32con.HWND_TOP, 0, 0, 0, 0,
win32con.SWP_NOMOVE | win32con.SWP_NOSIZE)
win32gui.SetForegroundWindow(hwnd)
time.sleep(0.2)
if win32gui.GetForegroundWindow() == hwnd:
return True, "HWND_TOPMOST"
except Exception as e:
self.logger.debug(f"Method 1 failed: {e}")
# Method 2: Minimize/restore cycle
methods_tried.append("MinimizeRestore")
try:
win32gui.ShowWindow(hwnd, win32con.SW_MINIMIZE)
time.sleep(0.3)
win32gui.ShowWindow(hwnd, win32con.SW_RESTORE)
time.sleep(0.2)
win32gui.SetForegroundWindow(hwnd)
if win32gui.GetForegroundWindow() == hwnd:
return True, "MinimizeRestore"
except Exception as e:
self.logger.debug(f"Method 2 failed: {e}")
# Method 3: Thread input attach
methods_tried.append("ThreadAttach")
try:
import win32process
import win32api
current_thread_id = win32api.GetCurrentThreadId()
window_thread_id = win32process.GetWindowThreadProcessId(hwnd)[0]
if current_thread_id != window_thread_id:
win32process.AttachThreadInput(current_thread_id, window_thread_id, True)
try:
win32gui.BringWindowToTop(hwnd)
win32gui.SetForegroundWindow(hwnd)
time.sleep(0.2)
if win32gui.GetForegroundWindow() == hwnd:
return True, "ThreadAttach"
finally:
win32process.AttachThreadInput(current_thread_id, window_thread_id, False)
except Exception as e:
self.logger.debug(f"Method 3 failed: {e}")
# Method 4: Flash + Window messages
methods_tried.append("Flash+Messages")
try:
# First flash to get attention
win32gui.FlashWindow(hwnd, True)
time.sleep(0.2)
# Then send specific window messages
win32gui.SendMessage(hwnd, win32con.WM_SETREDRAW, 0, 0)
win32gui.SendMessage(hwnd, win32con.WM_SETREDRAW, 1, 0)
win32gui.RedrawWindow(hwnd, None, None,
win32con.RDW_FRAME | win32con.RDW_INVALIDATE |
win32con.RDW_UPDATENOW | win32con.RDW_ALLCHILDREN)
win32gui.PostMessage(hwnd, win32con.WM_SYSCOMMAND, win32con.SC_RESTORE, 0)
win32gui.PostMessage(hwnd, win32con.WM_ACTIVATE, win32con.WA_ACTIVE, 0)
time.sleep(0.2)
if win32gui.GetForegroundWindow() == hwnd:
return True, "Flash+Messages"
except Exception as e:
self.logger.debug(f"Method 4 failed: {e}")
# Method 5: Hide/Show cycle
methods_tried.append("HideShow")
try:
win32gui.ShowWindow(hwnd, win32con.SW_HIDE)
time.sleep(0.2)
win32gui.ShowWindow(hwnd, win32con.SW_SHOW)
time.sleep(0.2)
win32gui.SetForegroundWindow(hwnd)
if win32gui.GetForegroundWindow() == hwnd:
return True, "HideShow"
except Exception as e:
self.logger.debug(f"Method 5 failed: {e}")
return False, f"All methods failed: {', '.join(methods_tried)}"
def _find_game_process_by_window(self):
"""Find process using both window title and process name"""
if not HAS_PSUTIL or not HAS_WIN32:
return None
try:
window = self._find_game_window()
if not window:
return None
hwnd = window._hWnd
window_pid = None
try:
import win32process
_, window_pid = win32process.GetWindowThreadProcessId(hwnd)
except Exception:
return None
if window_pid:
try:
proc = psutil.Process(window_pid)
proc_name = proc.name()
if proc_name.lower() == self.game_process_name.lower():
self.logger.info(f"Found game process '{proc_name}' (PID: {proc.pid}) with window title '{self.window_title}'")
return proc
else:
self.logger.debug(f"Window process name mismatch: expected '{self.game_process_name}', got '{proc_name}'")
return proc # Return the process anyway; the window title already matched
except Exception:
pass
# If the window lookup did not yield a usable process, return None;
# _find_game_process() then falls back to a name-only search.
return None
except Exception as e:
self.logger.error(f"Process-by-window lookup error: {e}")
return None
def _find_game_process(self):
"""Find game process with combined approach"""
# Try window-based process lookup first
proc = self._find_game_process_by_window()
if proc:
return proc
# Fall back to a name-only lookup
if not HAS_PSUTIL:
self.logger.debug("psutil not available for name-only process lookup fallback.")
return None
try:
for p_iter in psutil.process_iter(['pid', 'name', 'exe']):
try:
proc_info = p_iter.info
proc_name = proc_info.get('name')
if proc_name and proc_name.lower() == self.game_process_name.lower():
self.logger.info(f"Found game process by name '{proc_name}' (PID: {p_iter.pid}) as fallback")
return p_iter
except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
continue
except Exception as e:
self.logger.error(f"Error in name-only game process lookup: {e}")
self.logger.info(f"Game process '{self.game_process_name}' not found by name either.")
return None
def _perform_restart(self):
"""Execute the game restart process"""
self.logger.info("Starting game restart process")
try:
# 1. Notify that restart has begun (optional)
if self.callback:
self.callback("restart_begin")
# 2. Terminate existing game process
self._terminate_game_process()
time.sleep(2) # Short wait to ensure process termination
# 3. Start new game process
if self._start_game_process():
self.logger.info("Game restarted successfully")
else:
self.logger.error("Failed to start game")
# 4. Wait for game to launch
restart_wait_time = 45 # seconds
self.logger.info(f"Waiting for game to start ({restart_wait_time} seconds)...")
time.sleep(restart_wait_time)
# 5. Notify restart completion
self.logger.info("Game restart process completed, sending notification")
if self.callback:
self.callback("restart_complete")
return True
except Exception as e:
self.logger.error(f"Error during game restart process: {e}")
# Attempt to notify error
if self.callback:
self.callback("restart_error")
return False
def _terminate_game_process(self):
"""Terminate the game process"""
self.logger.info(f"Attempting to terminate game process '{self.game_process_name}'")
if not HAS_PSUTIL:
self.logger.warning("psutil is not available, cannot terminate process")
return False
process = self._find_game_process()
terminated = False
if process:
try:
self.logger.info(f"Found game process PID: {process.pid}, terminating...")
process.terminate()
try:
process.wait(timeout=5)
self.logger.info(f"Process {process.pid} terminated successfully (terminate)")
terminated = True
except psutil.TimeoutExpired:
self.logger.warning(f"Process {process.pid} did not terminate within 5s (terminate), attempting force kill")
process.kill()
process.wait(timeout=5)
self.logger.info(f"Process {process.pid} force killed (kill)")
terminated = True
except Exception as e:
self.logger.error(f"Error terminating process: {e}")
else:
self.logger.warning(f"No running process found with name '{self.game_process_name}'")
return terminated
def _start_game_process(self):
"""Start the game process"""
if not self.game_path:
self.logger.error("Game executable path not set, cannot start")
return False
self.logger.info(f"Starting game: {self.game_path}")
try:
if sys.platform == "win32":
os.startfile(self.game_path)
self.logger.info("Called os.startfile to launch game")
return True
else:
# Use subprocess.Popen for non-Windows platforms
# Ensure it runs detached if possible, or handle appropriately
subprocess.Popen([self.game_path], start_new_session=True) # Attempt detached start
self.logger.info("Called subprocess.Popen to launch game")
return True
except FileNotFoundError:
self.logger.error(f"Startup error: Game launcher '{self.game_path}' not found")
except OSError as ose:
self.logger.error(f"Startup error (OSError): {ose} - Check path and permissions", exc_info=True)
except Exception as e:
self.logger.error(f"Unexpected error starting game: {e}", exc_info=True)
return False
def restart_now(self):
"""Perform an immediate restart"""
self.logger.info("Manually triggering game restart")
result = self._perform_restart()
# Reset the timer if scheduled restart is enabled
if self.enable_restart and self.restart_interval > 0:
self.next_restart_time = time.time() + (self.restart_interval * 60)
self.logger.info(f"Restart timer reset. Next restart in {self.restart_interval} minutes")
return result
def update_config(self, config_data=None, remote_data=None):
"""Update configuration settings"""
if config_data:
old_config = self.config_data
self.config_data = config_data
# Update key settings
self.window_title = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("WINDOW_TITLE", self.window_title)
self.enable_restart = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("ENABLE_SCHEDULED_RESTART", self.enable_restart)
self.restart_interval = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("RESTART_INTERVAL_MINUTES", self.restart_interval)
self.game_path = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_EXECUTABLE_PATH", self.game_path)
self.window_x = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_X", self.window_x)
self.window_y = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_Y", self.window_y)
self.window_width = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_WIDTH", self.window_width)
self.window_height = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_HEIGHT", self.window_height)
self.monitor_interval = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("MONITOR_INTERVAL_SECONDS", self.monitor_interval)
# Reset scheduled restart timer if parameters changed
if self.running and self.enable_restart and self.restart_interval > 0:
old_interval = old_config.get("GAME_WINDOW_CONFIG", {}).get("RESTART_INTERVAL_MINUTES", 60)
if self.restart_interval != old_interval:
self.next_restart_time = time.time() + (self.restart_interval * 60)
self.logger.info(f"Restart interval updated to {self.restart_interval} minutes, next restart reset")
if remote_data:
self.remote_data = remote_data
old_process_name = self.game_process_name
self.game_process_name = self.remote_data.get("GAME_PROCESS_NAME", old_process_name)
if self.game_process_name != old_process_name:
self.logger.info(f"Game process name updated to '{self.game_process_name}'")
self.logger.info("GameMonitor configuration updated")
# Provide simple external API functions
def create_game_monitor(config_data, remote_data=None, logger=None, callback=None):
"""Create a game monitor instance"""
return GameMonitor(config_data, remote_data, logger, callback)
def stop_all_monitors():
"""Attempt to stop all created monitors (global cleanup)"""
# This function could be implemented if instance references are stored.
# In the current design, each monitor needs to be stopped individually.
pass
# Functionality when run standalone (similar to original game_monitor.py)
if __name__ == "__main__":
# Set up basic logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger("GameManagerStandalone")
# Load settings from config.py
try:
import config
logger.info("Loaded config.py")
# Build basic configuration dictionary
config_data = {
"GAME_WINDOW_CONFIG": {
"WINDOW_TITLE": config.WINDOW_TITLE,
"ENABLE_SCHEDULED_RESTART": config.ENABLE_SCHEDULED_RESTART,
"RESTART_INTERVAL_MINUTES": config.RESTART_INTERVAL_MINUTES,
"GAME_EXECUTABLE_PATH": config.GAME_EXECUTABLE_PATH,
"GAME_WINDOW_X": config.GAME_WINDOW_X,
"GAME_WINDOW_Y": config.GAME_WINDOW_Y,
"GAME_WINDOW_WIDTH": config.GAME_WINDOW_WIDTH,
"GAME_WINDOW_HEIGHT": config.GAME_WINDOW_HEIGHT,
"MONITOR_INTERVAL_SECONDS": config.MONITOR_INTERVAL_SECONDS
}
}
# Define a callback for standalone execution
def standalone_callback(action):
"""Send JSON signal via standard output"""
logger.info(f"Sending signal: {action}")
signal_data = {'action': action}
try:
json_signal = json.dumps(signal_data)
print(json_signal, flush=True)
logger.info(f"Signal sent: {action}")
except Exception as e:
logger.error(f"Failed to send signal '{action}': {e}")
# Create and start the monitor
monitor = GameMonitor(config_data, logger=logger, callback=standalone_callback)
monitor.start()
# Keep the program running
try:
logger.info("Game monitoring started. Press Ctrl+C to stop.")
while True:
time.sleep(1)
except KeyboardInterrupt:
logger.info("Ctrl+C received, stopping...")
finally:
monitor.stop()
logger.info("Game monitoring stopped")
except ImportError:
logger.error("Could not load config.py. Ensure it exists and contains necessary settings.")
sys.exit(1)
except Exception as e:
logger.error(f"Error starting game monitoring: {e}", exc_info=True)
sys.exit(1)
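The `start()`/`stop()`/`_monitor_loop()` pattern above can be sketched in isolation. One refinement worth noting: waiting on `stop_event.wait(timeout)` instead of `time.sleep(interval)` lets `stop()` interrupt the loop immediately rather than waiting out a full sleep. This is a minimal illustrative sketch (the class and attribute names are hypothetical, not the module's API):

```python
import threading
import time

class MiniMonitor:
    """Minimal sketch of the start/stop/daemon-thread pattern used above."""

    def __init__(self, interval=0.05):
        self.interval = interval
        self.checks = 0
        self.stop_event = threading.Event()
        self.thread = None

    def start(self):
        self.stop_event.clear()
        self.thread = threading.Thread(target=self._loop, daemon=True)
        self.thread.start()

    def _loop(self):
        # wait() doubles as the inter-check delay AND the stop signal,
        # so stop() does not have to wait out a full sleep interval.
        while not self.stop_event.wait(self.interval):
            self.checks += 1  # stand-in for "find window / adjust it"

    def stop(self, timeout=2):
        self.stop_event.set()
        if self.thread and self.thread.is_alive():
            self.thread.join(timeout=timeout)

m = MiniMonitor()
m.start()
time.sleep(0.3)
m.stop()
print(m.checks > 0, m.thread.is_alive())
```

The same change could be applied to `_monitor_loop()` above by replacing the trailing `time.sleep(self.monitor_interval)` with `self.stop_event.wait(self.monitor_interval)`.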


@@ -1,284 +0,0 @@
#!/usr/bin/env python
"""
Game Window Monitor Module
Continuously monitors the game window specified in the config,
ensuring it stays at the configured position, size, and remains topmost.
"""
import time
import datetime # Added
import subprocess # Added
import psutil # Added
import sys # Added
import json # Added
import os # Added for basename
import pygetwindow as gw
import win32gui
import win32con
import config
import logging
# import multiprocessing # Keep for Pipe/Queue if needed later, though using stdio now
# NOTE: config.py should handle dotenv loading. This script only imports values.
# --- Setup Logging ---
monitor_logger = logging.getLogger('GameMonitor')
monitor_logger.setLevel(logging.INFO) # Set level for the logger
log_formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
# Create handler for stderr
stderr_handler = logging.StreamHandler(sys.stderr) # Explicitly use stderr
stderr_handler.setFormatter(log_formatter)
# Add handler to the logger
if not monitor_logger.hasHandlers(): # Avoid adding multiple handlers if run multiple times
monitor_logger.addHandler(stderr_handler)
monitor_logger.propagate = False # Prevent propagation to root logger if basicConfig was called elsewhere
# --- Helper Functions ---
def restart_game_process():
"""Finds and terminates the existing game process, then restarts it."""
monitor_logger.info("Attempting to restart the game process.")
game_path = config.GAME_EXECUTABLE_PATH
if not game_path or not os.path.exists(os.path.dirname(game_path)): # Basic check
monitor_logger.error(f"Game executable path '{game_path}' is invalid or its directory does not exist; cannot restart.")
return
target_process_name = "LastWar.exe" # Correct process name
launcher_path = config.GAME_EXECUTABLE_PATH # Keep launcher path for restarting
monitor_logger.info(f"Looking for game process named '{target_process_name}'")
terminated = False
process_found = False
for proc in psutil.process_iter(['pid', 'name', 'exe']):
try:
proc_info = proc.info
proc_name = proc_info.get('name')
if proc_name == target_process_name:
process_found = True
monitor_logger.info(f"Found game process PID: {proc_info['pid']}, name: {proc_name}. Terminating...")
proc.terminate()
try:
proc.wait(timeout=5)
monitor_logger.info(f"Process {proc_info['pid']} terminated successfully (terminate).")
terminated = True
except psutil.TimeoutExpired:
monitor_logger.warning(f"Process {proc_info['pid']} did not terminate within 5s (terminate); attempting to kill it.")
proc.kill()
proc.wait(timeout=5) # Wait for kill with timeout
monitor_logger.info(f"Process {proc_info['pid']} killed.")
terminated = True
except Exception as wait_kill_err:
monitor_logger.error(f"Error while waiting for process {proc_info['pid']} to be killed: {wait_kill_err}", exc_info=False)
# Removed Termination Verification - Rely on main loop for eventual state correction
monitor_logger.info(f"Handled matching process PID: {proc_info['pid']}; stopping search.")
break # Exit the loop once a process is handled
except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
pass # Process might have already exited, access denied, or is a zombie
except Exception as e:
pid_str = proc.pid if hasattr(proc, 'pid') else 'N/A'
monitor_logger.error(f"Error while checking or terminating process PID {pid_str}: {e}", exc_info=False)
if process_found and not terminated:
monitor_logger.error("Found the game process but failed to terminate it.")
elif not process_found:
monitor_logger.warning(f"No running process named '{target_process_name}' was found.")
# Wait a moment before restarting, use the launcher path from config
time.sleep(2)
if not launcher_path or not os.path.exists(os.path.dirname(launcher_path)):
monitor_logger.error(f"Game launcher path '{launcher_path}' is invalid or its directory does not exist; cannot launch.")
return
monitor_logger.info(f"Launching game using launcher: {launcher_path}")
try:
if sys.platform == "win32":
os.startfile(launcher_path)
monitor_logger.info("Called os.startfile to launch the game.")
else:
subprocess.Popen([launcher_path])
monitor_logger.info("Called subprocess.Popen to launch the game.")
except FileNotFoundError:
monitor_logger.error(f"Launch error: game launcher not found at '{launcher_path}'.")
except OSError as ose:
monitor_logger.error(f"Launch error (OSError): {ose} - check path and permissions.", exc_info=True)
except Exception as e:
monitor_logger.error(f"Unexpected error while launching the game: {e}", exc_info=True)
# Don't return False here, let the process continue to send resume signal
# Removed Startup Verification - Rely on main loop for eventual state correction
# Always return True (or nothing) to indicate the attempt was made
return # Or return True, doesn't matter much now
def perform_scheduled_restart():
"""Handles the sequence of pausing UI, restarting game, resuming UI."""
monitor_logger.info("Starting scheduled restart sequence.")
# Removed pause_ui signal - UI will handle its own pause/resume based on restart_complete
try:
# 1. Attempt to restart the game (no verification)
monitor_logger.info("Attempting the game restart process.")
restart_game_process() # Fire-and-forget restart attempt
monitor_logger.info("Game restart attempt executed.")
# 2. Wait fixed time after restart attempt
monitor_logger.info("Waiting 30 seconds for the game to launch (no verification)...")
time.sleep(30) # Fixed wait
except Exception as restart_err:
monitor_logger.error(f"執行 restart_game_process 時發生未預期錯誤: {restart_err}", exc_info=True)
# Continue to finally block even on error
finally:
# 3. Signal main process that restart attempt is complete via stdout
monitor_logger.info("Sending restart complete signal.")
restart_complete_signal_data = {'action': 'restart_complete'}
try:
json_signal = json.dumps(restart_complete_signal_data)
print(json_signal, flush=True)
monitor_logger.info("Sent restart complete signal.")
except Exception as e:
monitor_logger.error(f"Failed to send restart complete signal {restart_complete_signal_data}: {e}", exc_info=True) # Log the raw signal data; json_signal may be unbound if dumps failed
monitor_logger.info("Scheduled restart sequence (including the finally block) finished.")
# Configure logger (basic example, adjust as needed)
# (Logging setup moved earlier)
def find_game_window(title=config.WINDOW_TITLE):
"""Attempts to find the game window by its title."""
try:
windows = gw.getWindowsWithTitle(title)
if windows:
return windows[0]
except Exception as e:
# Log errors if a logger was configured
# monitor_logger.error(f"Error finding window '{title}': {e}")
pass # Keep silent if window not found during normal check
return None
def monitor_game_window():
"""The main monitoring loop. Now runs directly, not in a thread."""
monitor_logger.info("Game window monitoring script started.")
last_adjustment_message = "" # Track last message to avoid spam
next_restart_time = None
# Initialize scheduled restart timer if enabled
if config.ENABLE_SCHEDULED_RESTART and config.RESTART_INTERVAL_MINUTES > 0:
interval_seconds = config.RESTART_INTERVAL_MINUTES * 60
next_restart_time = time.time() + interval_seconds
monitor_logger.info(f"Scheduled restart enabled. First restart in {config.RESTART_INTERVAL_MINUTES} minutes.")
else:
monitor_logger.info("Scheduled restart is disabled.")
while True: # Run indefinitely until terminated externally
# --- Scheduled Restart Check ---
if next_restart_time and time.time() >= next_restart_time:
monitor_logger.info("Scheduled restart time reached.")
perform_scheduled_restart()
# Reset timer for the next interval
interval_seconds = config.RESTART_INTERVAL_MINUTES * 60
next_restart_time = time.time() + interval_seconds
monitor_logger.info(f"Restart timer reset. Next restart in {config.RESTART_INTERVAL_MINUTES} minutes.")
# Continue to next loop iteration after restart sequence
time.sleep(config.MONITOR_INTERVAL_SECONDS) # Add a small delay before next check
continue
# --- Regular Window Monitoring ---
window = find_game_window()
adjustment_made = False
current_message = ""
if window:
try:
hwnd = window._hWnd # Get the window handle for win32 functions
# 1. Check and Adjust Position/Size
current_pos = (window.left, window.top)
current_size = (window.width, window.height)
target_pos = (config.GAME_WINDOW_X, config.GAME_WINDOW_Y)
target_size = (config.GAME_WINDOW_WIDTH, config.GAME_WINDOW_HEIGHT)
if current_pos != target_pos or current_size != target_size:
window.moveTo(target_pos[0], target_pos[1])
window.resizeTo(target_size[0], target_size[1])
# Verify if move/resize was successful before logging
time.sleep(0.1) # Give window time to adjust
window.activate() # Bring window to foreground before checking again
time.sleep(0.1)
new_pos = (window.left, window.top)
new_size = (window.width, window.height)
if new_pos == target_pos and new_size == target_size:
current_message += f"Adjusted game window to position {target_pos} size {target_size[0]}x{target_size[1]}. "
adjustment_made = True
else:
# Log failure if needed
# monitor_logger.warning(f"Failed to adjust window. Current: {new_pos} {new_size}, Target: {target_pos} {target_size}")
pass # Keep silent on failure for now
# 2. Check and Set Topmost
style = win32gui.GetWindowLong(hwnd, win32con.GWL_EXSTYLE)
is_topmost = style & win32con.WS_EX_TOPMOST
if not is_topmost:
# Set topmost, -1 for HWND_TOPMOST, flags = SWP_NOMOVE | SWP_NOSIZE
win32gui.SetWindowPos(hwnd, win32con.HWND_TOPMOST, 0, 0, 0, 0,
win32con.SWP_NOMOVE | win32con.SWP_NOSIZE)
# Verify
time.sleep(0.1)
new_style = win32gui.GetWindowLong(hwnd, win32con.GWL_EXSTYLE)
if new_style & win32con.WS_EX_TOPMOST:
current_message += "Set game window to topmost."
adjustment_made = True
else:
# Log failure if needed
# monitor_logger.warning("Failed to set window to topmost.")
pass # Keep silent
except gw.PyGetWindowException as e:
# Log PyGetWindowException specifically, might indicate window closed during check
monitor_logger.warning(f"Could not access window properties in the monitor loop (window may have closed): {e}")
except Exception as e:
# Log other exceptions during monitoring
monitor_logger.error(f"Unexpected error while monitoring the game window: {e}", exc_info=True)
# Log adjustment message only if an adjustment was made and it's different from the last one
# This should NOT print JSON signals
if adjustment_made and current_message and current_message != last_adjustment_message:
# Log the adjustment message instead of printing to stdout
monitor_logger.info(f"[GameMonitor] {current_message.strip()}")
last_adjustment_message = current_message
elif not window:
# Reset last message if window disappears
last_adjustment_message = ""
# Wait before the next check
time.sleep(config.MONITOR_INTERVAL_SECONDS)
# This part is theoretically unreachable in the new design as the loop is infinite
# and termination is handled externally by the parent process (main.py).
# monitor_logger.info("Game window monitoring script stopped.")
# Example usage (if run directly)
if __name__ == '__main__':
monitor_logger.info("Running game_monitor.py directly.")
monitor_logger.info(f"Will monitor the window titled '{config.WINDOW_TITLE}'")
monitor_logger.info(f"Target position: ({config.GAME_WINDOW_X}, {config.GAME_WINDOW_Y}), target size: {config.GAME_WINDOW_WIDTH}x{config.GAME_WINDOW_HEIGHT}")
monitor_logger.info(f"Check interval: {config.MONITOR_INTERVAL_SECONDS} seconds.")
if config.ENABLE_SCHEDULED_RESTART:
monitor_logger.info(f"Scheduled restart enabled, interval: {config.RESTART_INTERVAL_MINUTES} minutes.")
else:
monitor_logger.info("Scheduled restart disabled.")
monitor_logger.info("The script will run continuously; stop it with Ctrl+C from the launching terminal, or let the parent process terminate it.")
try:
monitor_game_window() # Start the main loop directly
except KeyboardInterrupt:
monitor_logger.info("收到 Ctrl+C正在退出...(Received Ctrl+C, exiting...)")
except Exception as e:
monitor_logger.critical(f"監控過程中發生致命錯誤: {e}", exc_info=True)
sys.exit(1) # Exit with error code
finally:
monitor_logger.info("Game Monitor 腳本執行完畢。(Game Monitor script finished.)")
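The monitor loop above repeatedly reads the game window's geometry and nudges it back when it drifts from the configured position and size. As a minimal sketch of that decision step (the `needs_adjustment` helper and plain-tuple geometry are illustrative assumptions, not code from the repo):

```python
# Hypothetical sketch of the monitor's per-iteration check: compare the
# window's current geometry against the configured target and report
# which corrections (move / resize) are needed.
def needs_adjustment(current: tuple[int, int, int, int],
                     target: tuple[int, int, int, int],
                     tolerance: int = 2) -> dict[str, bool]:
    """current/target are (x, y, width, height); tolerance absorbs OS jitter."""
    cx, cy, cw, ch = current
    tx, ty, tw, th = target
    return {
        "move": abs(cx - tx) > tolerance or abs(cy - ty) > tolerance,
        "resize": abs(cw - tw) > tolerance or abs(ch - th) > tolerance,
    }

# Example: window drifted 10 px to the right, size unchanged
print(needs_adjustment((110, 100, 800, 600), (100, 100, 800, 600)))
```

In the real script the corresponding move/resize calls would then only fire when one of these flags is set, which is also what makes the "log only when an adjustment was made" guard above meaningful.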


@@ -150,25 +150,24 @@ def get_system_prompt(
     else:
         # If no pre-loaded data is available, fall back to the full memory protocol
         memory_enforcement = f"""
 === CHROMADB MEMORY RETRIEVAL PROTOCOL - Wolfhart Memory Integration
 To personalize your responses to different users, you MUST follow this memory access protocol internally before responding:
 **1. Basic User Retrieval:**
 - Identify the username from `<CURRENT_MESSAGE>`
-# Fix: use BOT_MEMORY_COLLECTION to fetch the user profile
-- Using the `tool_calls` mechanism, execute: `chroma_query_documents(collection_name: "{config.BOT_MEMORY_COLLECTION}", query_texts: ["{{username}} profile"], n_results: 1)` # only the single most relevant result is needed
+- Using the `tool_calls` mechanism, execute: `chroma_get_documents(collection_name: "{config.PROFILES_COLLECTION}", ids: ["{{username}}_profile"])`
 - This step must be completed before any response generation
 **2. Context Expansion:**
 - Perform additional queries as needed, using the `tool_calls` mechanism:
 - Relevant conversations: `chroma_query_documents(collection_name: "{config.CONVERSATIONS_COLLECTION}", query_texts: ["{{username}} {{query keywords}}"], n_results: 5)`
 - Core personality reference: `chroma_query_documents(collection_name: "{config.BOT_MEMORY_COLLECTION}", query_texts: ["Wolfhart {{relevant attitude}}"], n_results: 3)`
 **3. Other situations**
 - You should check related memories when Users mention [capital_position], [capital_administrator_role], [server_hierarchy], [last_war], [winter_war], [excavations], [blueprints], [honor_points], [golden_eggs], or [diamonds], as these represent key game mechanics.
 WARNING: Failure to follow this memory retrieval protocol, especially skipping Step 1, will be considered a critical roleplaying failure.
 """
     # Assemble the system prompt
     system_prompt = f"""

main.py

@@ -16,6 +16,8 @@ from mcp import ClientSession, StdioServerParameters, types
 # --- Keyboard Imports ---
 import threading
 import time
+# Import MessageDeduplication from ui_interaction
+from ui_interaction import MessageDeduplication
 try:
     import keyboard  # Needs pip install keyboard
 except ImportError:
@@ -30,7 +32,6 @@ import llm_interaction
 # Import UI module
 import ui_interaction
 import chroma_client
-# import game_monitor # No longer importing, will run as subprocess
 import subprocess  # Import subprocess module
 import signal
 import platform
@@ -65,9 +66,6 @@
 trigger_queue: ThreadSafeQueue = ThreadSafeQueue()  # UI Thread -> Main Loop
 command_queue: ThreadSafeQueue = ThreadSafeQueue()  # Main Loop -> UI Thread
 # --- End Change ---
 ui_monitor_task: asyncio.Task | None = None  # To track the UI monitor task
-game_monitor_process: subprocess.Popen | None = None  # To store the game monitor subprocess
-monitor_reader_task: asyncio.Future | None = None  # Store the future from run_in_executor
-stop_reader_event = threading.Event()  # Event to signal the reader thread to stop
 # --- Keyboard Shortcut State ---
 script_paused = False
@@ -107,16 +105,14 @@ def handle_f8():
     except Exception as e:
         print(f"Error sending pause command (F8): {e}")
     else:
-        print("\n--- F8 pressed: Resuming script, resetting state, and resuming UI monitoring ---")
+        print("\n--- F8 pressed: Resuming script and UI monitoring ---")
-        reset_command = {'action': 'reset_state'}
         resume_command = {'action': 'resume'}
         try:
-            main_loop.call_soon_threadsafe(command_queue.put_nowait, reset_command)
             # Add a small delay? Let's try without first.
             # time.sleep(0.05) # Short delay between commands if needed
             main_loop.call_soon_threadsafe(command_queue.put_nowait, resume_command)
         except Exception as e:
-            print(f"Error sending reset/resume commands (F8): {e}")
+            print(f"Error sending resume command (F8): {e}")

 def handle_f9():
     """Handles F9 press: Initiates script shutdown."""
@@ -149,70 +145,6 @@ def keyboard_listener():
 # --- End Keyboard Shortcut Handlers ---

-# --- Game Monitor Signal Reader (Threaded Blocking Version) ---
-def read_monitor_output(process: subprocess.Popen, queue: ThreadSafeQueue, loop: asyncio.AbstractEventLoop, stop_event: threading.Event):
-    """Runs in a separate thread, reads stdout blocking, parses JSON, and puts commands in the queue."""
-    print("Game monitor output reader thread started.")
-    try:
-        while not stop_event.is_set():
-            if not process.stdout:
-                print("[Monitor Reader Thread] Subprocess stdout is None. Exiting thread.")
-                break
-            try:
-                # Blocking read - this is fine in a separate thread
-                line = process.stdout.readline()
-            except ValueError:
-                # Can happen if the pipe is closed during readline
-                print("[Monitor Reader Thread] ValueError on readline (pipe likely closed). Exiting thread.")
-                break
-            if not line:
-                # EOF reached (process terminated)
-                print("[Monitor Reader Thread] EOF reached on stdout. Exiting thread.")
-                break
-            line = line.strip()
-            if line:
-                # Log raw line immediately
-                print(f"[Monitor Reader Thread] Received raw line: '{line}'")
-                try:
-                    data = json.loads(line)
-                    action = data.get('action')
-                    print(f"[Monitor Reader Thread] Parsed action: '{action}'")  # Log parsed action
-                    if action == 'pause_ui':
-                        command = {'action': 'pause'}
-                        print(f"[Monitor Reader Thread] Preparing to queue command: {command}")  # Log before queueing
-                        loop.call_soon_threadsafe(queue.put_nowait, command)
-                        print("[Monitor Reader Thread] Pause command queued.")  # Log after queueing
-                    elif action == 'resume_ui':
-                        # Removed direct resume_ui handling - ui_interaction will handle pause/resume based on restart_complete
-                        print("[Monitor Reader Thread] Received old 'resume_ui' signal, ignoring.")
-                    elif action == 'restart_complete':
-                        command = {'action': 'handle_restart_complete'}
-                        print(f"[Monitor Reader Thread] Received 'restart_complete' signal, preparing to queue command: {command}")
-                        try:
-                            loop.call_soon_threadsafe(queue.put_nowait, command)
-                            print("[Monitor Reader Thread] 'handle_restart_complete' command queued.")
-                        except Exception as q_err:
-                            print(f"[Monitor Reader Thread] Error putting 'handle_restart_complete' command in queue: {q_err}")
-                    else:
-                        print(f"[Monitor Reader Thread] Received unknown action from monitor: {action}")
-                except json.JSONDecodeError:
-                    print(f"[Monitor Reader Thread] ERROR: Could not decode JSON from monitor: '{line}'")
-                    # Log the raw line that failed to parse
-                    # print(f"[Monitor Reader Thread] Raw line that failed JSON decode: '{line}'") # Already logged raw line earlier
-                except Exception as e:
-                    print(f"[Monitor Reader Thread] Error processing monitor output: {e}")
-            # No sleep needed here as readline() is blocking
-    except Exception as e:
-        # Catch broader errors in the thread loop itself
-        print(f"[Monitor Reader Thread] Thread loop error: {e}")
-    finally:
-        print("Game monitor output reader thread stopped.")
-# --- End Game Monitor Signal Reader ---

 # --- Chat Logging Function ---
 def log_chat_interaction(user_name: str, user_message: str, bot_name: str, bot_message: str, bot_thoughts: str | None = None):
     """Logs the chat interaction, including optional bot thoughts, to a date-stamped file if enabled."""
@@ -318,7 +250,7 @@ if platform.system() == "Windows" and win32api and win32con:
 # --- Cleanup Function ---
 async def shutdown():
     """Gracefully closes connections and stops monitoring tasks/processes."""
-    global wolfhart_persona_details, ui_monitor_task, shutdown_requested, game_monitor_process, monitor_reader_task  # Add monitor_reader_task
+    global wolfhart_persona_details, ui_monitor_task, shutdown_requested
     # Ensure shutdown is requested if called externally (e.g., Ctrl+C)
     if not shutdown_requested:
         print("Shutdown initiated externally (e.g., Ctrl+C).")
@@ -338,42 +270,7 @@ async def shutdown():
         except Exception as e:
             print(f"Error while waiting for UI monitoring task cancellation: {e}")

-    # 1b. Signal and Wait for Monitor Reader Thread
-    if monitor_reader_task:  # Check if the future exists
-        if not stop_reader_event.is_set():
-            print("Signaling monitor output reader thread to stop...")
-            stop_reader_event.set()
-        # Wait for the thread to finish (the future returned by run_in_executor)
-        # This might block briefly, but it's necessary to ensure clean thread shutdown
-        # We don't await it directly in the async shutdown, but check if it's done
-        # A better approach might be needed if the thread blocks indefinitely
-        print("Waiting for monitor output reader thread to finish (up to 2s)...")
-        try:
-            # Wait for the future to complete with a timeout
-            await asyncio.wait_for(monitor_reader_task, timeout=2.0)
-            print("Monitor output reader thread finished.")
-        except asyncio.TimeoutError:
-            print("Warning: Monitor output reader thread did not finish within timeout.")
-        except asyncio.CancelledError:
-            print("Monitor output reader future was cancelled.")  # Should not happen if we don't cancel it
-        except Exception as e:
-            print(f"Error waiting for monitor reader thread future: {e}")
-
-    # 2. Terminate Game Monitor Subprocess (after signaling reader thread)
-    if game_monitor_process:
-        print("Terminating game monitor subprocess...")
-        try:
-            game_monitor_process.terminate()
-            # Optionally wait for a short period or check return code
-            # game_monitor_process.wait(timeout=1)
-            print("Game monitor subprocess terminated.")
-        except Exception as e:
-            print(f"Error terminating game monitor subprocess: {e}")
-        finally:
-            game_monitor_process = None  # Clear the reference
-
-    # 3. Close MCP connections via AsyncExitStack
+    # 2. Close MCP connections via AsyncExitStack
     # This will trigger the __aexit__ method of stdio_client contexts,
     # which we assume handles terminating the server subprocesses it started.
     print(f"Closing MCP Server connections (via AsyncExitStack)...")
@@ -555,7 +452,7 @@ def initialize_memory_system():
 # --- Main Async Function ---
 async def run_main_with_exit_stack():
     """Initializes connections, loads persona, starts UI monitor and main processing loop."""
-    global initialization_successful, main_task, loop, wolfhart_persona_details, trigger_queue, ui_monitor_task, shutdown_requested, script_paused, command_queue, game_monitor_process, monitor_reader_task  # Add monitor_reader_task to globals
+    global initialization_successful, main_task, loop, wolfhart_persona_details, trigger_queue, ui_monitor_task, shutdown_requested, script_paused, command_queue
     try:
         # 1. Load Persona Synchronously (before async loop starts)
         load_persona_from_file()  # Corrected function
@@ -586,57 +483,38 @@ async def run_main_with_exit_stack():
         # 5. Start UI Monitoring in a separate thread
         print("\n--- Starting UI monitoring thread ---")
-        # Use the new monitoring loop function, passing both queues
+        # 5c. Create MessageDeduplication instance
+        deduplicator = MessageDeduplication(expiry_seconds=3600)  # Default 1 hour
+        # Use the new monitoring loop function, passing both queues and the deduplicator
         monitor_task = loop.create_task(
-            asyncio.to_thread(ui_interaction.run_ui_monitoring_loop, trigger_queue, command_queue),  # Pass command_queue
+            asyncio.to_thread(ui_interaction.run_ui_monitoring_loop, trigger_queue, command_queue, deduplicator),  # Pass command_queue and deduplicator
             name="ui_monitor"
         )
         ui_monitor_task = monitor_task  # Store task reference for shutdown
         # Note: UI task cancellation is handled in shutdown()

-        # 5b. Start Game Window Monitoring as a Subprocess
-        # global game_monitor_process, monitor_reader_task # Already declared global at function start
-        print("\n--- Starting Game Window monitoring as a subprocess ---")
-        try:
-            # Use sys.executable to ensure the same Python interpreter is used
-            # Capture stdout to read signals
-            game_monitor_process = subprocess.Popen(
-                [sys.executable, 'game_monitor.py'],
-                stdout=subprocess.PIPE,  # Capture stdout
-                stderr=subprocess.PIPE,  # Capture stderr for logging/debugging
-                text=True,  # Decode stdout/stderr as text (UTF-8 by default)
-                bufsize=1,  # Line buffered
-                # Ensure process creation flags are suitable for Windows if needed
-                # creationflags=subprocess.CREATE_NO_WINDOW # Example: Hide console window
-            )
-            print(f"Game monitor subprocess started (PID: {game_monitor_process.pid}).")
-            # Start the thread to read monitor output if process started successfully
-            if game_monitor_process.stdout:
-                # Run the blocking reader function in a separate thread using the default executor
-                monitor_reader_task = loop.run_in_executor(
-                    None,  # Use default ThreadPoolExecutor
-                    read_monitor_output,  # The function to run
-                    game_monitor_process,  # Arguments for the function...
-                    command_queue,
-                    loop,
-                    stop_reader_event  # Pass the stop event
-                )
-                print("Monitor output reader thread submitted to executor.")
-            else:
-                print("Error: Could not access game monitor subprocess stdout.")
-                monitor_reader_task = None
-            # Optionally, start a task to read stderr as well for debugging
-            # stderr_reader_task = loop.create_task(read_stderr(game_monitor_process), name="monitor_stderr_reader")
-        except FileNotFoundError:
-            print("Error: 'game_monitor.py' not found. Cannot start game monitor subprocess.")
-            game_monitor_process = None
-        except Exception as e:
-            print(f"Error starting game monitor subprocess: {e}")
-            game_monitor_process = None
+        # 5b. Game Window Monitoring is now handled by Setup.py
+
+        # 5d. Start Periodic Cleanup Timer for Deduplicator
+        def periodic_cleanup():
+            if not shutdown_requested:  # Only run if not shutting down
+                print("Main Thread: Running periodic deduplicator cleanup...")
+                deduplicator.purge_expired()
+                # Reschedule the timer
+                cleanup_timer = threading.Timer(600, periodic_cleanup)  # 10 minutes
+                cleanup_timer.daemon = True
+                cleanup_timer.start()
+            else:
+                print("Main Thread: Shutdown requested, not rescheduling deduplicator cleanup.")
+
+        print("\n--- Starting periodic deduplicator cleanup timer (10 min interval) ---")
+        initial_cleanup_timer = threading.Timer(600, periodic_cleanup)
+        initial_cleanup_timer.daemon = True
+        initial_cleanup_timer.start()
+        # Note: This timer will run in a separate thread.
+        # Ensure it's handled correctly on shutdown if it holds resources.
+        # Since it's a daemon thread and reschedules itself, it should exit when the main program exits.

         # 6. Start the main processing loop (non-blocking check on queue)
         print("\n--- Wolfhart chatbot has started (waiting for triggers) ---")
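The diff only shows `MessageDeduplication`'s call sites (`MessageDeduplication(expiry_seconds=3600)` and `purge_expired()`). A minimal sketch of a class satisfying that interface follows; the normalization and hashing details here are assumptions for illustration, not the repo's actual implementation in `ui_interaction.py`:

```python
import hashlib
import time

class MessageDeduplication:
    """Sketch: remember normalized (sender, message) pairs, expire after a TTL."""

    def __init__(self, expiry_seconds: int = 3600):
        self.expiry_seconds = expiry_seconds
        self._seen: dict[str, float] = {}  # key -> last-seen timestamp

    def _key(self, sender: str, message: str) -> str:
        # Normalize case and whitespace so trivial variations still match
        norm = f"{sender.strip().lower()}|{' '.join(message.split()).lower()}"
        return hashlib.sha256(norm.encode("utf-8")).hexdigest()

    def is_duplicate(self, sender: str, message: str) -> bool:
        now = time.monotonic()
        key = self._key(sender, message)
        last = self._seen.get(key)
        self._seen[key] = now  # refresh the timestamp either way
        return last is not None and (now - last) < self.expiry_seconds

    def purge_expired(self) -> None:
        # What the 10-minute timer in the diff calls periodically
        now = time.monotonic()
        self._seen = {k: t for k, t in self._seen.items()
                      if (now - t) < self.expiry_seconds}

dedup = MessageDeduplication(expiry_seconds=3600)
print(dedup.is_duplicate("Alice", "hello"))    # first sighting
print(dedup.is_duplicate("alice", "  hello ")) # normalized repeat
```

The periodic `purge_expired()` call matters because `is_duplicate` alone only ever adds entries; without the timer the seen-set would grow without bound over a long session.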

memory_backup.py (new file)

@@ -0,0 +1,42 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Wolf Chat memory backup tool
Runs a one-off memory backup or starts the scheduled backup daemon
"""
import sys
import argparse
import datetime
from memory_manager import run_memory_backup_manual, MemoryScheduler # Updated import
import config # Import config to access default schedule times
def main():
parser = argparse.ArgumentParser(description='Wolf Chat memory backup tool')
parser.add_argument('--backup', action='store_true', help='Run a one-off backup (defaults to yesterday unless --date is given)')
parser.add_argument('--date', type=str, help='Process logs for the given date (YYYY-MM-DD), used with --backup')
parser.add_argument('--schedule', action='store_true', help='Start the scheduled backup daemon')
parser.add_argument('--hour', type=int, help='Backup hour (0-23), used with --schedule')
parser.add_argument('--minute', type=int, help='Backup minute (0-59), used with --schedule')
args = parser.parse_args()
if args.backup:
# The date logic is now handled inside run_memory_backup_manual
run_memory_backup_manual(args.date)
elif args.schedule:
scheduler = MemoryScheduler()
# Use provided hour/minute or fallback to config defaults
backup_hour = args.hour if args.hour is not None else getattr(config, 'MEMORY_BACKUP_HOUR', 0)
backup_minute = args.minute if args.minute is not None else getattr(config, 'MEMORY_BACKUP_MINUTE', 0)
scheduler.schedule_daily_backup(backup_hour, backup_minute)
scheduler.start()
else:
print("Please specify an operation: --backup or --schedule")
parser.print_help()
sys.exit(1)
if __name__ == "__main__":
main()

memory_manager.py (new file)

@@ -0,0 +1,783 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Wolf Chat memory management module
All-in-one module for chat-log parsing, memory generation, and ChromaDB writes
"""
import os
import re
import json
import time
import asyncio
import datetime
import schedule
from pathlib import Path
from typing import Dict, List, Optional, Any, Union, Callable
from functools import wraps
# import chromadb # No longer directly needed by ChromaDBManager
# from chromadb.utils import embedding_functions # No longer directly needed by ChromaDBManager
from openai import AsyncOpenAI
import config
import chroma_client # Import the centralized chroma client
# =============================================================================
# Retry decorator
# =============================================================================
def retry_operation(max_attempts: int = 3, delay: float = 1.0):
"""Retry decorator for database operations."""
def decorator(func: Callable) -> Callable:
@wraps(func)
def wrapper(*args, **kwargs) -> Any:
attempts = 0
last_error = None
while attempts < max_attempts:
try:
return func(*args, **kwargs)
except Exception as e:
attempts += 1
last_error = e
print(f"Operation failed, attempt {attempts}/{max_attempts}: {e}")
if attempts < max_attempts:
# Exponential backoff
sleep_time = delay * (2 ** (attempts - 1))
print(f"Waiting {sleep_time:.2f} s before retrying...")
time.sleep(sleep_time)
print(f"Operation failed after the maximum number of attempts ({max_attempts}); last error: {last_error}")
# In production you may want to re-raise the last error or return a specific error indicator
# Depending on your needs, returning False may be appropriate in some cases
return False  # or: raise last_error
return wrapper
return decorator
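The decorator above retries with exponential backoff (delay, 2·delay, 4·delay, …) and falls back to `False` after the last attempt. Restated compactly here so the snippet runs standalone, a flaky operation that succeeds on its third attempt behaves like this:

```python
import time
from functools import wraps

# Compact restatement of retry_operation for a self-contained demo
def retry_operation(max_attempts: int = 3, delay: float = 0.01):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        return False  # mirrors the module's fallback
                    time.sleep(delay * (2 ** (attempt - 1)))  # exponential backoff
        return wrapper
    return decorator

calls = {"n": 0}

@retry_operation(max_attempts=3, delay=0.01)
def flaky_upsert():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient DB error")
    return True

print(flaky_upsert(), calls["n"])  # succeeds on the 3rd attempt
```

Returning `False` instead of re-raising keeps callers simple but silently swallows the error type; the comment in the original code acknowledges that `raise last_error` may be the better choice in production.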
# =============================================================================
# Log parsing
# =============================================================================
def parse_log_file(log_path: str) -> List[Dict[str, str]]:
"""Parse a log file and extract its conversation entries."""
conversations = []
with open(log_path, 'r', encoding='utf-8') as f:
content = f.read()
# Split the content into dialogue blocks on the '---' delimiter
dialogue_blocks = content.split('---')
for block in dialogue_blocks:
if not block.strip():
continue
# Parse the dialogue block
timestamp_pattern = r'\[([\d-]+ [\d:]+)\]'
user_pattern = r'User \(([^)]+)\): (.+?)(?=\[|$)'
bot_thoughts_pattern = r'Bot \(([^)]+)\) Thoughts: (.+?)(?=\[|$)'
bot_dialogue_pattern = r'Bot \(([^)]+)\) Dialogue: (.+?)(?=\[|$)'
# Extract the timestamp and the user/bot fields
timestamp_match = re.search(timestamp_pattern, block)
user_match = re.search(user_pattern, block, re.DOTALL)
bot_thoughts_match = re.search(bot_thoughts_pattern, block, re.DOTALL)
bot_dialogue_match = re.search(bot_dialogue_pattern, block, re.DOTALL)
if timestamp_match and user_match and bot_dialogue_match:
timestamp = timestamp_match.group(1)
user_name = user_match.group(1)
user_message = user_match.group(2).strip()
bot_name = bot_dialogue_match.group(1)
bot_message = bot_dialogue_match.group(2).strip()
bot_thoughts = bot_thoughts_match.group(2).strip() if bot_thoughts_match else ""
# Build the conversation record
conversation = {
"timestamp": timestamp,
"user_name": user_name,
"user_message": user_message,
"bot_name": bot_name,
"bot_message": bot_message,
"bot_thoughts": bot_thoughts
}
conversations.append(conversation)
return conversations
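The patterns above can be exercised on a synthetic dialogue block (the sample text below is fabricated for illustration; it assumes each field sits on a timestamp-prefixed line so that the `(?=\[|$)` lookahead terminates each match):

```python
import re

# A fabricated dialogue block in the shape parse_log_file() expects
block = (
    "[2025-05-16 02:00:00] User (Alice): How do I claim the capital position?\n"
    "[2025-05-16 02:00:01] Bot (Wolfhart) Thoughts: She is asking about game mechanics.\n"
    "[2025-05-16 02:00:02] Bot (Wolfhart) Dialogue: Petition the current administrator first.\n"
)

# The same three patterns used by parse_log_file()
timestamp = re.search(r'\[([\d-]+ [\d:]+)\]', block)
user = re.search(r'User \(([^)]+)\): (.+?)(?=\[|$)', block, re.DOTALL)
dialogue = re.search(r'Bot \(([^)]+)\) Dialogue: (.+?)(?=\[|$)', block, re.DOTALL)

print(timestamp.group(1), user.group(1), dialogue.group(2).strip())
```

Note that without `re.MULTILINE`, `$` only matches at the end of the block, so the lazy `(.+?)` captures are effectively delimited by the next `[` (i.e., the next timestamped line) — which is why `.strip()` is applied to each captured field.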
def get_logs_for_date(date: datetime.date, log_dir: str = "chat_logs") -> List[Dict[str, str]]:
"""Load and parse the log file for the given date."""
date_str = date.strftime("%Y-%m-%d")
log_path = os.path.join(log_dir, f"{date_str}.log")
if os.path.exists(log_path):
return parse_log_file(log_path)
return []
def group_conversations_by_user(conversations: List[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
"""Group conversations by user."""
user_conversations = {}
for conv in conversations:
user_name = conv["user_name"]
if user_name not in user_conversations:
user_conversations[user_name] = []
user_conversations[user_name].append(conv)
return user_conversations
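`group_conversations_by_user` is a plain bucketing step; with `collections.defaultdict` it can be restated compactly (a standalone sketch with fabricated sample records):

```python
from collections import defaultdict

# Restated bucketing: conversation records keyed by their "user_name" field
def group_by_user(conversations):
    grouped = defaultdict(list)
    for conv in conversations:
        grouped[conv["user_name"]].append(conv)
    return dict(grouped)

convs = [
    {"user_name": "Alice", "user_message": "hi"},
    {"user_name": "Bob", "user_message": "hey"},
    {"user_name": "Alice", "user_message": "bye"},
]
print({user: len(msgs) for user, msgs in group_by_user(convs).items()})
```

Insertion order is preserved within each bucket, which matters downstream: the profile and summary generators receive each user's conversations in chronological order.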
# =============================================================================
# Memory generator
# =============================================================================
class MemoryGenerator:
def __init__(self, profile_model: Optional[str] = None, summary_model: Optional[str] = None):
self.profile_client = AsyncOpenAI(
api_key=config.OPENAI_API_KEY,
base_url=config.OPENAI_API_BASE_URL if config.OPENAI_API_BASE_URL else None,
)
self.summary_client = AsyncOpenAI(
api_key=config.OPENAI_API_KEY,
base_url=config.OPENAI_API_BASE_URL if config.OPENAI_API_BASE_URL else None,
)
self.profile_model = profile_model or getattr(config, 'MEMORY_PROFILE_MODEL', config.LLM_MODEL)
self.summary_model = summary_model or getattr(config, 'MEMORY_SUMMARY_MODEL', "mistral-7b-instruct")
self.persona_data = self._load_persona_data()
def _load_persona_data(self, persona_file: str = "persona.json") -> Dict[str, Any]:
"""Load persona data from JSON file."""
try:
with open(persona_file, 'r', encoding='utf-8') as f:
return json.load(f)
except FileNotFoundError:
print(f"Warning: Persona file '{persona_file}' not found. Proceeding without persona data.")
return {}
except json.JSONDecodeError:
print(f"Warning: Error decoding JSON from '{persona_file}'. Proceeding without persona data.")
return {}
async def generate_user_profile(
self,
user_name: str,
conversations: List[Dict[str, str]],
existing_profile: Optional[Dict[str, Any]] = None
) -> Optional[Dict[str, Any]]:
"""Generate or update user profile based on conversations"""
system_prompt = self._get_profile_system_prompt(config.PERSONA_NAME, existing_profile)
# Prepare user conversation records
conversation_text = self._format_conversations_for_prompt(conversations)
user_prompt = f"""
Please generate a complete profile for user '{user_name}':
Conversation history:
{conversation_text}
Please analyze this user based on the conversation history and your personality, and generate or update a profile in JSON format, including:
1. User's personality traits
2. Relationship with you ({config.PERSONA_NAME})
3. Your subjective perception of the user
4. Important interaction records
5. Any other information you think is important
Please ensure the output is valid JSON format, using the following format:
```json
{{
"id": "{user_name}_profile",
"type": "user_profile",
"username": "{user_name}",
"content": {{
"personality": "User personality traits...",
"relationship_with_bot": "Description of relationship with me...",
"bot_perception": "My subjective perception of the user...",
"notable_interactions": ["Important interaction 1", "Important interaction 2"]
}},
"last_updated": "YYYY-MM-DD",
"metadata": {{
"priority": 1.0,
"word_count": 0
}}
}}
```
When evaluating, please pay special attention to my "thoughts" section, as that reflects my true thoughts about the user.
"""
try:
response = await self.profile_client.chat.completions.create(
model=self.profile_model,
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt}
],
temperature=0.7
)
# Parse JSON response
profile_text = response.choices[0].message.content
# Extract JSON part
json_match = re.search(r'```json\s*(.*?)\s*```', profile_text, re.DOTALL)
if json_match:
profile_json_str = json_match.group(1)
else:
# Try parsing directly
profile_json_str = profile_text
profile_json = json.loads(profile_json_str)
# After parsing the initial JSON response
content_str = json.dumps(profile_json["content"], ensure_ascii=False)
if len(content_str) > 5000:
# Too long - request a more concise version
condensed_prompt = f"Your profile is {len(content_str)} characters. Create a new version under 5000 characters. Keep the same structure but be extremely concise."
condensed_response = await self.profile_client.chat.completions.create(
model=self.profile_model,
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt},
{"role": "assistant", "content": profile_json_str},
{"role": "user", "content": condensed_prompt}
],
temperature=0.5
)
# Extract the condensed JSON
condensed_text = condensed_response.choices[0].message.content
# Parse JSON and update profile_json
json_match = re.search(r'```json\s*(.*?)\s*```', condensed_text, re.DOTALL)
if json_match:
profile_json_str = json_match.group(1)
else:
profile_json_str = condensed_text
profile_json = json.loads(profile_json_str)
content_str = json.dumps(profile_json["content"], ensure_ascii=False) # Recalculate content_str
profile_json["metadata"]["word_count"] = len(content_str)
profile_json["last_updated"] = datetime.datetime.now().strftime("%Y-%m-%d")
return profile_json
except Exception as e:
print(f"Error generating user profile: {e}")
return None
async def generate_conversation_summary(
self,
user_name: str,
conversations: List[Dict[str, str]]
) -> Optional[Dict[str, Any]]:
"""Generate conversation summary for user"""
system_prompt = f"""
You are {config.PERSONA_NAME}, an intelligent conversational AI.
Your task is to summarize the conversations between you and the user, preserving key information and emotional changes.
The summary should be concise yet informative, not exceeding 250 words.
"""
# Prepare user conversation records
conversation_text = self._format_conversations_for_prompt(conversations)
# Generate current date
today = datetime.datetime.now().strftime("%Y-%m-%d")
user_prompt = f"""
Please summarize my conversation with user '{user_name}' on {today}:
{conversation_text}
Please output in JSON format, as follows:
```json
{{
"id": "{user_name}_summary_{today.replace('-', '')}",
"type": "dialogue_summary",
"date": "{today}",
"username": "{user_name}",
"content": "Conversation summary content...",
"key_points": ["Key point 1", "Key point 2"],
"metadata": {{
"priority": 0.7,
"word_count": 0
}}
}}
```
The summary should reflect my perspective and views on the conversation, not a neutral third-party perspective.
"""
try:
response = await self.summary_client.chat.completions.create(
model=self.summary_model,
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt}
],
temperature=0.5
)
# Parse JSON response
summary_text = response.choices[0].message.content
# Extract JSON part
json_match = re.search(r'```json\s*(.*?)\s*```', summary_text, re.DOTALL)
if json_match:
summary_json_str = json_match.group(1)
else:
# Try parsing directly
summary_json_str = summary_text
summary_json = json.loads(summary_json_str)
# Add or update word count
summary_json["metadata"]["word_count"] = len(summary_json["content"])
return summary_json
except Exception as e:
print(f"Error generating conversation summary: {e}")
return None
def _get_profile_system_prompt(self, bot_name: str, existing_profile: Optional[Dict[str, Any]] = None) -> str:
"""Get system prompt for generating user profile"""
persona_details = ""
if self.persona_data:
# Construct a string from persona_data, focusing on key aspects
# We can be selective here or dump the whole thing if the model can handle it.
# For now, let's include a significant portion.
persona_info_to_include = {
"name": self.persona_data.get("name"),
"personality": self.persona_data.get("personality"),
"language_social": self.persona_data.get("language_social"),
"values_interests_goals": self.persona_data.get("values_interests_goals"),
"preferences_reactions": self.persona_data.get("preferences_reactions")
}
persona_details = f"""
Your detailed persona profile is as follows:
```json
{json.dumps(persona_info_to_include, ensure_ascii=False, indent=2)}
```
Please embody this persona when analyzing the user and generating their profile.
"""
system_prompt = f"""
You are {bot_name}, an AI assistant with deep analytical capabilities.
{persona_details}
Your task is to analyze the user's interactions with you, creating user profiles.
CRITICAL: The ENTIRE profile content must be under 5000 characters total. Be extremely concise.
The profile should:
1. Be completely based on your character's perspective
2. Focus only on key personality traits and core relationship dynamics
3. Include only the most significant interactions
The output should be valid JSON format, following the provided template.
"""
if existing_profile:
system_prompt += f"""
You already have an existing user profile, please update based on this:
```json
{json.dumps(existing_profile, ensure_ascii=False, indent=2)}
```
Please retain valid information, integrate new observations, and resolve any contradictions or outdated information.
"""
return system_prompt
def _format_conversations_for_prompt(self, conversations: List[Dict[str, str]]) -> str:
"""Format conversation records for prompt"""
conversation_text = ""
for i, conv in enumerate(conversations):
conversation_text += f"Conversation {i+1}:\n"
conversation_text += f"Time: {conv['timestamp']}\n"
conversation_text += f"User ({conv['user_name']}): {conv['user_message']}\n"
if conv.get('bot_thoughts'): # Check if bot_thoughts exists
conversation_text += f"My thoughts: {conv['bot_thoughts']}\n"
conversation_text += f"My response: {conv['bot_message']}\n\n"
return conversation_text
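`_format_conversations_for_prompt` flattens the parsed records into the prompt text the generators consume. Restated standalone (sample record fabricated), the shape of the output is:

```python
# Standalone restatement of _format_conversations_for_prompt:
# one numbered section per record, thoughts included only when present.
def format_conversations(conversations):
    text = ""
    for i, conv in enumerate(conversations):
        text += f"Conversation {i+1}:\n"
        text += f"Time: {conv['timestamp']}\n"
        text += f"User ({conv['user_name']}): {conv['user_message']}\n"
        if conv.get('bot_thoughts'):
            text += f"My thoughts: {conv['bot_thoughts']}\n"
        text += f"My response: {conv['bot_message']}\n\n"
    return text

sample = [{
    "timestamp": "2025-05-16 02:00:00",
    "user_name": "Alice",
    "user_message": "Hello",
    "bot_thoughts": "",
    "bot_message": "Greetings.",
}]
print(format_conversations(sample))
```

The `conv.get('bot_thoughts')` guard means an empty-string thoughts field is silently dropped from the prompt, even though the profile prompt tells the model to weigh the "thoughts" section heavily when it is present.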
# =============================================================================
# ChromaDB operations
# =============================================================================
class ChromaDBManager:
def __init__(self, collection_name: Optional[str] = None):
self.collection_name = collection_name or config.BOT_MEMORY_COLLECTION
self._db_collection = None # Cache for the collection object
def _get_db_collection(self):
"""Helper to get the collection object from chroma_client"""
if self._db_collection is None:
# Use the centralized get_collection function
self._db_collection = chroma_client.get_collection(self.collection_name)
if self._db_collection is None:
# This indicates a failure in chroma_client to provide the collection
raise RuntimeError(f"Failed to get or create collection '{self.collection_name}' via chroma_client. Check chroma_client logs.")
return self._db_collection
@retry_operation(max_attempts=3, delay=1.0)
def upsert_user_profile(self, profile_data: Dict[str, Any]) -> bool:
"""寫入或更新用戶檔案"""
collection = self._get_db_collection()
if not profile_data or not isinstance(profile_data, dict):
print("無效的檔案數據")
return False
try:
user_id = profile_data.get("id")
if not user_id:
print("檔案缺少ID字段")
return False
# 準備元數據
# Note: ChromaDB's upsert handles existence check implicitly.
# The .get call here isn't strictly necessary for the upsert operation itself,
# but might be kept if there was other logic depending on prior existence.
# For a clean upsert, it can be removed. Let's assume it's not critical for now.
# results = collection.get(ids=[user_id], limit=1) # Optional: if needed for pre-check logic
metadata = {
"id": user_id,
"type": "user_profile",
"username": profile_data.get("username", ""),
"priority": 1.0 # 高優先級
}
# 添加其他元數據
if "metadata" in profile_data and isinstance(profile_data["metadata"], dict):
for k, v in profile_data["metadata"].items():
if k not in ["id", "type", "username", "priority"]: # Avoid overwriting key fields
# 處理非基本類型的值
if isinstance(v, (list, dict, tuple)):
# 轉換為字符串
metadata[k] = json.dumps(v, ensure_ascii=False)
else:
metadata[k] = v
# 序列化內容
content_doc = json.dumps(profile_data.get("content", {}), ensure_ascii=False)
# 寫入或更新
collection.upsert(
ids=[user_id],
documents=[content_doc],
metadatas=[metadata]
)
print(f"Upserted user profile: {user_id} into collection {self.collection_name}")
return True
except Exception as e:
print(f"寫入用戶檔案時出錯: {e}")
return False
@retry_operation(max_attempts=3, delay=1.0)
def upsert_conversation_summary(self, summary_data: Dict[str, Any]) -> bool:
"""寫入對話總結"""
collection = self._get_db_collection()
if not summary_data or not isinstance(summary_data, dict):
print("無效的總結數據")
return False
try:
summary_id = summary_data.get("id")
if not summary_id:
print("總結缺少ID字段")
return False
# 準備元數據
metadata = {
"id": summary_id,
"type": "dialogue_summary",
"username": summary_data.get("username", ""),
"date": summary_data.get("date", ""),
"priority": 0.7 # 低優先級
}
# 添加其他元數據
if "metadata" in summary_data and isinstance(summary_data["metadata"], dict):
for k, v in summary_data["metadata"].items():
if k not in ["id", "type", "username", "date", "priority"]:
# 處理非基本類型的值
if isinstance(v, (list, dict, tuple)):
# 轉換為字符串
metadata[k] = json.dumps(v, ensure_ascii=False)
else:
metadata[k] = v
# 獲取內容
content_doc = summary_data.get("content", "")
if "key_points" in summary_data and summary_data["key_points"]:
key_points_str = "\n".join([f"- {point}" for point in summary_data["key_points"]])
content_doc += f"\n\n關鍵點:\n{key_points_str}"
# 寫入數據
collection.upsert(
ids=[summary_id],
documents=[content_doc],
metadatas=[metadata]
)
print(f"Upserted conversation summary: {summary_id} into collection {self.collection_name}")
return True
except Exception as e:
print(f"寫入對話總結時出錯: {e}")
return False
def get_existing_profile(self, username: str) -> Optional[Dict[str, Any]]:
"""獲取現有的用戶檔案"""
collection = self._get_db_collection()
try:
profile_id = f"{username}_profile"
results = collection.get(
ids=[profile_id],
limit=1
)
if results and results["ids"] and results["documents"]:
idx = 0
# Ensure document is not None before trying to load
doc_content = results["documents"][idx]
if doc_content is None:
print(f"Warning: Document for profile {profile_id} is None.")
return None
profile_data = {
"id": profile_id,
"type": "user_profile",
"username": username,
"content": json.loads(doc_content),
"last_updated": "", # Will be populated from metadata if exists
"metadata": {}
}
# 獲取元數據
if results["metadatas"] and results["metadatas"][idx]:
metadata_db = results["metadatas"][idx]
for k, v in metadata_db.items():
if k == "last_updated":
profile_data["last_updated"] = str(v) # Ensure it's a string
elif k not in ["id", "type", "username"]:
profile_data["metadata"][k] = v
return profile_data
return None
except json.JSONDecodeError as je:
print(f"Error decoding JSON for profile {username}: {je}")
return None
except Exception as e:
print(f"獲取用戶檔案時出錯 for {username}: {e}")
return None
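The metadata-flattening pattern used by both upsert methods above (ChromaDB metadata values must be scalars, so list/dict/tuple values are JSON-encoded and reserved keys are skipped) can be factored into a small helper. This is an illustrative sketch, not part of the module; `flatten_metadata` and its arguments are hypothetical names:

```python
import json

def flatten_metadata(raw: dict, reserved: set) -> dict:
    """Flatten metadata for ChromaDB: JSON-encode non-scalar values,
    skipping reserved keys so core fields are never overwritten."""
    flat = {}
    for k, v in raw.items():
        if k in reserved:
            continue  # protect id/type/username/priority set elsewhere
        if isinstance(v, (list, dict, tuple)):
            flat[k] = json.dumps(v, ensure_ascii=False)
        else:
            flat[k] = v
    return flat

meta = flatten_metadata(
    {"id": "x", "tags": ["friend", "ally"], "score": 3},
    reserved={"id", "type", "username", "priority"},
)
```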
# =============================================================================
# Memory Manager
# =============================================================================
class MemoryManager:
def __init__(self):
self.memory_generator = MemoryGenerator(
profile_model=getattr(config, 'MEMORY_PROFILE_MODEL', config.LLM_MODEL),
summary_model=getattr(config, 'MEMORY_SUMMARY_MODEL', "mistral-7b-instruct")
)
self.db_manager = ChromaDBManager(collection_name=config.BOT_MEMORY_COLLECTION)
# Ensure LOG_DIR is correctly referenced from config
self.log_dir = getattr(config, 'LOG_DIR', "chat_logs")
async def process_daily_logs(self, date: Optional[datetime.date] = None) -> None:
"""處理指定日期的日誌(預設為昨天)"""
# 如果未指定日期,使用昨天
if date is None:
date = datetime.datetime.now().date() - datetime.timedelta(days=1)
date_str = date.strftime("%Y-%m-%d")
log_path = os.path.join(self.log_dir, f"{date_str}.log")
if not os.path.exists(log_path):
print(f"找不到日誌文件: {log_path}")
return
print(f"開始處理日誌文件: {log_path}")
# 解析日誌
conversations = parse_log_file(log_path)
if not conversations:
print(f"日誌文件 {log_path} 為空或未解析到對話。")
return
print(f"解析到 {len(conversations)} 條對話記錄")
# 按用戶分組
user_conversations = group_conversations_by_user(conversations)
print(f"共有 {len(user_conversations)} 個用戶有對話")
# 為每個用戶生成/更新檔案和對話總結
failed_users = []
for username, convs in user_conversations.items():
print(f"Processing user '{username}' ({len(convs)} conversations)")
try:
# 獲取現有檔案
existing_profile = self.db_manager.get_existing_profile(username)
# 生成或更新用戶檔案
profile_data = await self.memory_generator.generate_user_profile(
username, convs, existing_profile
)
if profile_data:
profile_success = self.db_manager.upsert_user_profile(profile_data)
if not profile_success:
print(f"警告: 無法保存用戶 '{username}' 的檔案")
# 生成對話總結
summary_data = await self.memory_generator.generate_conversation_summary(
username, convs
)
if summary_data:
summary_success = self.db_manager.upsert_conversation_summary(summary_data)
if not summary_success:
print(f"警告: 無法保存用戶 '{username}' 的對話總結")
except Exception as e:
print(f"處理用戶 '{username}' 時出錯: {e}")
failed_users.append(username)
continue # 繼續處理下一個用戶
if failed_users:
print(f"以下用戶處理失敗: {', '.join(failed_users)}")
print(f"日誌處理完成: {log_path}")
# =============================================================================
# Scheduler
# =============================================================================
class MemoryScheduler:
def __init__(self):
self.memory_manager = MemoryManager()
self.scheduled = False # To track if a job is already scheduled
def schedule_daily_backup(self, hour: Optional[int] = None, minute: Optional[int] = None) -> None:
"""設置每日備份時間"""
# Clear any existing jobs to prevent duplicates if called multiple times
schedule.clear()
backup_hour = hour if hour is not None else getattr(config, 'MEMORY_BACKUP_HOUR', 0)
backup_minute = minute if minute is not None else getattr(config, 'MEMORY_BACKUP_MINUTE', 0)
time_str = f"{backup_hour:02d}:{backup_minute:02d}"
# 設置定時任務
schedule.every().day.at(time_str).do(self._run_daily_backup_job)
self.scheduled = True
print(f"已設置每日備份時間: {time_str}")
def _run_daily_backup_job(self) -> None:
"""Helper to run the async job for scheduler."""
print(f"開始執行每日記憶備份 - {datetime.datetime.now()}")
try:
# Create a new event loop for the thread if not running in main thread
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.run_until_complete(self.memory_manager.process_daily_logs())
loop.close()
print(f"每日記憶備份完成 - {datetime.datetime.now()}")
except Exception as e:
print(f"執行每日備份時出錯: {e}")
# schedule.every().day.at...do() expects the job function to return schedule.CancelJob
# if it should not be rescheduled. Otherwise, it's rescheduled.
# For a daily job, we want it to reschedule, so we don't return CancelJob.
def start(self) -> None:
"""啟動調度器"""
if not self.scheduled:
self.schedule_daily_backup() # Schedule with default/config times if not already
print("Scheduler started. Press Ctrl+C to stop.")
try:
while True:
schedule.run_pending()
time.sleep(1) # Check every second
except KeyboardInterrupt:
print("調度器已停止")
except Exception as e:
print(f"調度器運行時發生錯誤: {e}")
finally:
print("調度器正在關閉...")
# =============================================================================
# Direct-run entry point
# =============================================================================
def run_memory_backup_manual(date_str: Optional[str] = None) -> None:
"""手動執行記憶備份 for a specific date string or yesterday."""
target_date = None
if date_str:
try:
target_date = datetime.datetime.strptime(date_str, "%Y-%m-%d").date()
except ValueError:
print(f"無效的日期格式: {date_str}。將使用昨天的日期。")
target_date = datetime.datetime.now().date() - datetime.timedelta(days=1)
else:
target_date = datetime.datetime.now().date() - datetime.timedelta(days=1)
print(f"未指定日期,將處理昨天的日誌: {target_date.strftime('%Y-%m-%d')}")
memory_manager = MemoryManager()
# Setup asyncio event loop for the manual run
loop = asyncio.get_event_loop()
if loop.is_closed():
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(memory_manager.process_daily_logs(target_date))
except Exception as e:
print(f"手動執行記憶備份時出錯: {e}")
finally:
# If we created a new loop, we might want to close it.
# However, if get_event_loop() returned an existing running loop,
# we should not close it here.
# For simplicity in a script, this might be okay, but in complex apps, be careful.
# loop.close() # Be cautious with this line.
pass
print("記憶備份完成")
# 如果直接運行此腳本
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser(description='Wolf Chat 記憶管理模組')
parser.add_argument('--backup', action='store_true', help='執行一次性備份 (預設為昨天,除非指定 --date)')
parser.add_argument('--date', type=str, help='處理指定日期的日誌 (YYYY-MM-DD格式) for --backup')
parser.add_argument('--schedule', action='store_true', help='啟動定時調度器')
parser.add_argument('--hour', type=int, help='Backup hour (0-23), for --schedule')
parser.add_argument('--minute', type=int, help='Backup minute (0-59), for --schedule')
args = parser.parse_args()
if args.backup:
run_memory_backup_manual(args.date)
elif args.schedule:
scheduler = MemoryScheduler()
# Pass hour/minute only if they are provided, otherwise defaults in schedule_daily_backup will be used
scheduler.schedule_daily_backup(
hour=args.hour if args.hour is not None else getattr(config, 'MEMORY_BACKUP_HOUR', 0),
minute=args.minute if args.minute is not None else getattr(config, 'MEMORY_BACKUP_MINUTE', 0)
)
scheduler.start()
else:
print("請指定操作: --backup 或 --schedule")
parser.print_help()

reembed_chroma_data.py Normal file

@ -0,0 +1,529 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Reembedding Tool

This script re-computes vectors for data in existing ChromaDB collections
using a new embedding model and stores the results.
"""
import os
import sys
import json
import time
import argparse
import shutil
from datetime import datetime
from typing import List, Dict, Any, Optional, Tuple
from tqdm import tqdm # 進度條
try:
import chromadb
from chromadb.utils import embedding_functions
except ImportError:
print("錯誤: 請先安裝 chromadb: pip install chromadb")
sys.exit(1)
try:
from sentence_transformers import SentenceTransformer
except ImportError:
print("錯誤: 請先安裝 sentence-transformers: pip install sentence-transformers")
sys.exit(1)
# 嘗試導入配置
try:
import config
except ImportError:
print("Warning: could not import config.py; using default settings")
# 建立最小配置
class MinimalConfig:
CHROMA_DATA_DIR = "chroma_data"
BOT_MEMORY_COLLECTION = "wolfhart_memory"
CONVERSATIONS_COLLECTION = "wolfhart_memory"
PROFILES_COLLECTION = "wolfhart_memory"
config = MinimalConfig()
def parse_args():
"""處理命令行參數"""
parser = argparse.ArgumentParser(description='ChromaDB 數據重新嵌入工具')
parser.add_argument('--new-model', type=str,
default="sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
help='新的嵌入模型名稱 (預設: sentence-transformers/paraphrase-multilingual-mpnet-base-v2)')
parser.add_argument('--collections', type=str, nargs='+',
help=f'要處理的集合名稱列表,空白分隔 (預設: 使用配置中的所有集合)')
parser.add_argument('--backup', action='store_true',
help='在處理前備份資料庫 (推薦)')
parser.add_argument('--batch-size', type=int, default=100,
help='批處理大小 (預設: 100)')
parser.add_argument('--temp-collection-suffix', type=str, default="_temp_new",
help='臨時集合的後綴名稱 (預設: _temp_new)')
parser.add_argument('--dry-run', action='store_true',
help='模擬執行但不實際修改資料')
parser.add_argument('--confirm-dangerous', action='store_true',
help='確認執行危險操作(例如刪除集合)')
return parser.parse_args()
def backup_chroma_directory(chroma_dir: str) -> str:
"""備份ChromaDB數據目錄
Args:
chroma_dir: ChromaDB數據目錄路徑
Returns:
備份目錄的路徑
"""
if not os.path.exists(chroma_dir):
print(f"錯誤: ChromaDB目錄 '{chroma_dir}' 不存在")
sys.exit(1)
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
backup_dir = f"{chroma_dir}_backup_{timestamp}"
print(f"Backing up database from '{chroma_dir}' to '{backup_dir}'...")
shutil.copytree(chroma_dir, backup_dir)
print(f"備份完成: {backup_dir}")
return backup_dir
def create_embedding_function(model_name: str):
"""創建嵌入函數
Args:
model_name: 嵌入模型名稱
Returns:
嵌入函數對象
"""
if not model_name:
print("使用ChromaDB預設嵌入模型")
return embedding_functions.DefaultEmbeddingFunction()
print(f"正在加載嵌入模型: {model_name}")
try:
# 直接使用SentenceTransformerEmbeddingFunction
from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction
embedding_function = SentenceTransformerEmbeddingFunction(model_name=model_name)
# 預熱模型
_ = embedding_function(["."])
return embedding_function
except Exception as e:
print(f"錯誤: 無法加載模型 '{model_name}': {e}")
print("退回到預設嵌入模型")
return embedding_functions.DefaultEmbeddingFunction()
def get_collection_names(client, default_collections: List[str]) -> List[str]:
"""獲取所有可用的集合名稱
Args:
client: ChromaDB客戶端
default_collections: 預設集合列表
Returns:
可用的集合名稱列表
"""
try:
all_collections = client.list_collections()
collection_names = [col.name for col in all_collections]
if collection_names:
return collection_names
else:
print("警告: 沒有找到集合,將使用預設集合")
return default_collections
except Exception as e:
print(f"獲取集合列表失敗: {e}")
print("將使用預設集合")
return default_collections
def fetch_collection_data(client, collection_name: str, batch_size: int = 100) -> Dict[str, Any]:
"""從集合中提取所有數據
Args:
client: ChromaDB客戶端
collection_name: 集合名稱
batch_size: 批處理大小
Returns:
集合數據字典包含ids, documents, metadatas
"""
try:
collection = client.get_collection(name=collection_name)
# 獲取該集合中的項目總數
count_result = collection.count()
if count_result == 0:
print(f"集合 '{collection_name}' 是空的")
return {"ids": [], "documents": [], "metadatas": []}
print(f"從集合 '{collection_name}' 中讀取 {count_result} 項數據...")
# 分批獲取數據
all_ids = []
all_documents = []
all_metadatas = []
offset = 0
with tqdm(total=count_result, desc=f"正在讀取 {collection_name}") as pbar:
while True:
# 注意: 使用include參數指定只獲取需要的數據
batch_result = collection.get(
limit=batch_size,
offset=offset,
include=["documents", "metadatas"]
)
batch_ids = batch_result.get("ids", [])
if not batch_ids:
break
all_ids.extend(batch_ids)
all_documents.extend(batch_result.get("documents", []))
all_metadatas.extend(batch_result.get("metadatas", []))
offset += len(batch_ids)
pbar.update(len(batch_ids))
if len(batch_ids) < batch_size:
break
return {
"ids": all_ids,
"documents": all_documents,
"metadatas": all_metadatas
}
except Exception as e:
print(f"從集合 '{collection_name}' 獲取數據時出錯: {e}")
return {"ids": [], "documents": [], "metadatas": []}
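The paging loop in `fetch_collection_data` (advance `offset` by the batch actually returned, stop on an empty or short batch) can be exercised against an in-memory stand-in. `FakeCollection` below is hypothetical and only mimics the `get(limit, offset)` shape used here:

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a ChromaDB collection's get()."""
    def __init__(self, ids):
        self._ids = ids

    def get(self, limit, offset):
        return {"ids": self._ids[offset:offset + limit]}

def fetch_all_ids(collection, batch_size=3):
    all_ids, offset = [], 0
    while True:
        batch = collection.get(limit=batch_size, offset=offset)["ids"]
        if not batch:
            break
        all_ids.extend(batch)
        offset += len(batch)
        if len(batch) < batch_size:  # a short batch means we reached the end
            break
    return all_ids

ids = fetch_all_ids(FakeCollection([f"doc{i}" for i in range(8)]))
```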
def create_and_populate_collection(
client,
collection_name: str,
data: Dict[str, Any],
embedding_func,
batch_size: int = 100,
dry_run: bool = False
) -> bool:
"""創建新集合並填充數據
Args:
client: ChromaDB客戶端
collection_name: 集合名稱
data: 要添加的數據 (ids, documents, metadatas)
embedding_func: 嵌入函數
batch_size: 批處理大小
dry_run: 是否只模擬執行
Returns:
Returns True on success, False otherwise
"""
if dry_run:
print(f"[模擬] 將創建集合 '{collection_name}' 並添加 {len(data['ids'])} 項數據")
return True
try:
# 檢查集合是否已存在
if collection_name in [col.name for col in client.list_collections()]:
client.delete_collection(collection_name)
# 創建新集合
collection = client.create_collection(
name=collection_name,
embedding_function=embedding_func
)
# 如果沒有數據,直接返回
if not data["ids"]:
print(f"集合 '{collection_name}' 創建完成,但沒有數據添加")
return True
# 分批添加數據
total_items = len(data["ids"])
with tqdm(total=total_items, desc=f"正在填充 {collection_name}") as pbar:
for i in range(0, total_items, batch_size):
end_idx = min(i + batch_size, total_items)
batch_ids = data["ids"][i:end_idx]
batch_docs = data["documents"][i:end_idx]
batch_meta = data["metadatas"][i:end_idx]
# 處理可能的None值
processed_docs = []
for doc in batch_docs:
if doc is None:
processed_docs.append("") # 使用空字符串替代None
else:
processed_docs.append(doc)
collection.add(
ids=batch_ids,
documents=processed_docs,
metadatas=batch_meta
)
pbar.update(end_idx - i)
print(f"成功將 {total_items} 項數據添加到集合 '{collection_name}'")
return True
except Exception as e:
print(f"創建或填充集合 '{collection_name}' 時出錯: {e}")
import traceback
traceback.print_exc()
return False
def swap_collections(
client,
original_collection: str,
temp_collection: str,
confirm_dangerous: bool = False,
dry_run: bool = False,
embedding_func = None # 添加嵌入函數作為參數
) -> bool:
"""替換集合(刪除原始集合,將臨時集合重命名為原始集合名)
Args:
client: ChromaDB客戶端
original_collection: 原始集合名稱
temp_collection: 臨時集合名稱
confirm_dangerous: 是否確認危險操作
dry_run: 是否只模擬執行
embedding_func: 嵌入函數用於創建新集合
Returns:
Returns True on success, False otherwise
"""
if dry_run:
print(f"[dry run] Would swap collections: delete '{original_collection}' and rename '{temp_collection}' to '{original_collection}'")
return True
try:
# 檢查是否有確認標誌
if not confirm_dangerous:
response = input(f"警告: 即將刪除集合 '{original_collection}' 並用 '{temp_collection}' 替換它。確認操作? (y/N): ")
if response.lower() != 'y':
print("操作已取消")
return False
# 檢查兩個集合是否都存在
all_collections = [col.name for col in client.list_collections()]
if original_collection not in all_collections:
print(f"錯誤: 原始集合 '{original_collection}' 不存在")
return False
if temp_collection not in all_collections:
print(f"錯誤: 臨時集合 '{temp_collection}' 不存在")
return False
# 獲取臨時集合的所有數據
# 在刪除原始集合之前先獲取臨時集合的所有數據
print(f"獲取臨時集合 '{temp_collection}' 的數據...")
temp_collection_obj = client.get_collection(temp_collection)
temp_data = temp_collection_obj.get(include=["documents", "metadatas"])
# 刪除原始集合
print(f"刪除原始集合 '{original_collection}'...")
client.delete_collection(original_collection)
# 創建一個同名的新集合(與原始集合同名)
print(f"創建新集合 '{original_collection}'...")
# 使用傳入的嵌入函數或臨時集合的嵌入函數
embedding_function = embedding_func or temp_collection_obj._embedding_function
# 創建新的集合
original_collection_obj = client.create_collection(
name=original_collection,
embedding_function=embedding_function
)
# 將數據添加到新集合
if temp_data["ids"]:
print(f"{len(temp_data['ids'])} 項數據從臨時集合複製到新集合...")
# 處理可能的None值
processed_docs = []
for doc in temp_data["documents"]:
if doc is None:
processed_docs.append("")
else:
processed_docs.append(doc)
# 使用分批方式添加數據以避免潛在的大數據問題
batch_size = 100
for i in range(0, len(temp_data["ids"]), batch_size):
end = min(i + batch_size, len(temp_data["ids"]))
original_collection_obj.add(
ids=temp_data["ids"][i:end],
documents=processed_docs[i:end],
metadatas=temp_data["metadatas"][i:end] if temp_data["metadatas"] else None
)
# 刪除臨時集合
print(f"刪除臨時集合 '{temp_collection}'...")
client.delete_collection(temp_collection)
print(f"成功用重新嵌入的數據替換集合 '{original_collection}'")
return True
except Exception as e:
print(f"替換集合時出錯: {e}")
import traceback
traceback.print_exc()
return False
def process_collection(
client,
collection_name: str,
embedding_func,
temp_suffix: str,
batch_size: int,
confirm_dangerous: bool,
dry_run: bool
) -> bool:
"""處理一個集合的完整流程
Args:
client: ChromaDB客戶端
collection_name: 要處理的集合名稱
embedding_func: 新的嵌入函數
temp_suffix: 臨時集合的後綴
batch_size: 批處理大小
confirm_dangerous: 是否確認危險操作
dry_run: 是否只模擬執行
Returns:
Returns True if processing succeeded, False otherwise
"""
print(f"\n{'=' * 60}")
print(f"處理集合: '{collection_name}'")
print(f"{'=' * 60}")
# 暫時集合名稱
temp_collection_name = f"{collection_name}{temp_suffix}"
# 1. 獲取原始集合的數據
data = fetch_collection_data(client, collection_name, batch_size)
if not data["ids"]:
print(f"集合 '{collection_name}' 為空或不存在,跳過")
return True
# 2. 創建臨時集合並使用新的嵌入模型填充數據
success = create_and_populate_collection(
client,
temp_collection_name,
data,
embedding_func,
batch_size,
dry_run
)
if not success:
print(f"創建臨時集合 '{temp_collection_name}' 失敗,跳過替換")
return False
# 3. 替換原始集合
success = swap_collections(
client,
collection_name,
temp_collection_name,
confirm_dangerous,
dry_run,
embedding_func # 添加嵌入函數作為參數
)
return success
def main():
"""主函數"""
args = parse_args()
# 獲取ChromaDB目錄
chroma_dir = getattr(config, "CHROMA_DATA_DIR", "chroma_data")
print(f"使用ChromaDB目錄: {chroma_dir}")
# 備份數據庫(如果請求)
if args.backup:
backup_chroma_directory(chroma_dir)
# 創建ChromaDB客戶端
try:
client = chromadb.PersistentClient(path=chroma_dir)
except Exception as e:
print(f"錯誤: 無法連接到ChromaDB: {e}")
sys.exit(1)
# 創建嵌入函數
embedding_func = create_embedding_function(args.new_model)
# 確定要處理的集合
if args.collections:
collections_to_process = args.collections
else:
# 使用配置中的默認集合或獲取所有可用集合
default_collections = [
getattr(config, "BOT_MEMORY_COLLECTION", "wolfhart_memory"),
getattr(config, "CONVERSATIONS_COLLECTION", "conversations"),
getattr(config, "PROFILES_COLLECTION", "user_profiles")
]
collections_to_process = get_collection_names(client, default_collections)
# 過濾掉已經是臨時集合的集合名稱
filtered_collections = []
for collection in collections_to_process:
if args.temp_collection_suffix in collection:
print(f"警告: 跳過可能的臨時集合 '{collection}'")
continue
filtered_collections.append(collection)
collections_to_process = filtered_collections
if not collections_to_process:
print("沒有找到可處理的集合。")
sys.exit(0)
print(f"將處理以下集合: {', '.join(collections_to_process)}")
if args.dry_run:
print("注意: 執行為乾運行模式,不會實際修改數據")
# 詢問用戶確認
if not args.confirm_dangerous and not args.dry_run:
confirm = input("這個操作將使用新的嵌入模型重新計算所有數據。繼續? (y/N): ")
if confirm.lower() != 'y':
print("操作已取消")
sys.exit(0)
# 處理每個集合
start_time = time.time()
success_count = 0
for collection_name in collections_to_process:
if process_collection(
client,
collection_name,
embedding_func,
args.temp_collection_suffix,
args.batch_size,
args.confirm_dangerous,
args.dry_run
):
success_count += 1
# 報告結果
elapsed_time = time.time() - start_time
print(f"\n{'=' * 60}")
print(f"處理完成: {success_count}/{len(collections_to_process)} 個集合成功")
print(f"Total time: {elapsed_time:.2f} seconds")
print(f"{'=' * 60}")
if __name__ == "__main__":
main()
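`create_and_populate_collection` above adds data in slices of `batch_size` and replaces `None` documents with empty strings before calling `collection.add`. That batching-and-cleaning step can be sketched independently; the helper name is illustrative:

```python
def batches_with_clean_docs(ids, docs, batch_size):
    """Yield (ids, docs) batches, replacing None documents with ""
    the way create_and_populate_collection does before collection.add()."""
    for i in range(0, len(ids), batch_size):
        end = min(i + batch_size, len(ids))
        cleaned = ["" if d is None else d for d in docs[i:end]]
        yield ids[i:end], cleaned

out = list(batches_with_clean_docs(["a", "b", "c"], ["x", None, "z"], 2))
```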

simple_bubble_dedup.py Normal file

@ -0,0 +1,155 @@
import os
import json
import collections
import threading
from PIL import Image
import imagehash
import numpy as np
import io
class SimpleBubbleDeduplication:
def __init__(self, storage_file="simple_bubble_dedup.json", max_bubbles=5, threshold=5, hash_size=16):
self.storage_file = storage_file
        self.max_bubbles = max_bubbles # Keep only the most recent N bubbles (default 5)
self.threshold = threshold # Hash difference threshold (lower values are more strict)
self.hash_size = hash_size # Hash size
self.lock = threading.Lock()
# Use OrderedDict to maintain order
self.recent_bubbles = collections.OrderedDict()
# Load stored bubble hashes
self._load_storage()
def _load_storage(self):
"""Load processed bubble hash values from file"""
if os.path.exists(self.storage_file):
try:
with open(self.storage_file, 'r') as f:
data = json.load(f)
# Convert stored data to OrderedDict and load
self.recent_bubbles.clear()
# Use loaded_count to track loaded items, ensuring we don't exceed max_bubbles
loaded_count = 0
for bubble_id, bubble_data in data.items():
if loaded_count >= self.max_bubbles:
break
self.recent_bubbles[bubble_id] = {
'hash': imagehash.hex_to_hash(bubble_data['hash']),
'sender': bubble_data.get('sender', 'Unknown')
}
loaded_count += 1
print(f"Loaded {len(self.recent_bubbles)} bubble hash records")
except Exception as e:
print(f"Failed to load bubble hash records: {e}")
self.recent_bubbles.clear()
def _save_storage(self):
"""Save bubble hashes to file"""
try:
# Create temporary dictionary for saving
data_to_save = {}
for bubble_id, bubble_data in self.recent_bubbles.items():
data_to_save[bubble_id] = {
'hash': str(bubble_data['hash']),
'sender': bubble_data.get('sender', 'Unknown')
}
with open(self.storage_file, 'w') as f:
json.dump(data_to_save, f, indent=2)
print(f"Saved {len(data_to_save)} bubble hash records")
except Exception as e:
print(f"Failed to save bubble hash records: {e}")
def compute_image_hash(self, bubble_snapshot):
"""Calculate perceptual hash of bubble image"""
try:
# If bubble_snapshot is a PIL.Image object
if isinstance(bubble_snapshot, Image.Image):
img = bubble_snapshot
# If bubble_snapshot is a PyAutoGUI screenshot
elif hasattr(bubble_snapshot, 'save'):
img = bubble_snapshot
# If it's bytes or BytesIO
elif isinstance(bubble_snapshot, (bytes, io.BytesIO)):
img = Image.open(io.BytesIO(bubble_snapshot) if isinstance(bubble_snapshot, bytes) else bubble_snapshot)
# If it's a numpy array
elif isinstance(bubble_snapshot, np.ndarray):
img = Image.fromarray(bubble_snapshot)
else:
print(f"Unrecognized image format: {type(bubble_snapshot)}")
return None
# Calculate perceptual hash
phash = imagehash.phash(img, hash_size=self.hash_size)
return phash
except Exception as e:
print(f"Failed to calculate image hash: {e}")
return None
def generate_bubble_id(self, bubble_region):
"""Generate ID based on bubble region"""
return f"bubble_{bubble_region[0]}_{bubble_region[1]}_{bubble_region[2]}_{bubble_region[3]}"
def is_duplicate(self, bubble_snapshot, bubble_region, sender_name=""):
"""Check if bubble is a duplicate"""
with self.lock:
if bubble_snapshot is None:
return False
# Calculate hash of current bubble
current_hash = self.compute_image_hash(bubble_snapshot)
if current_hash is None:
print("Unable to calculate bubble hash, cannot perform deduplication")
return False
# Generate ID for current bubble
bubble_id = self.generate_bubble_id(bubble_region)
# Check if similar to any known bubbles
for stored_id, bubble_data in self.recent_bubbles.items():
stored_hash = bubble_data['hash']
hash_diff = current_hash - stored_hash
if hash_diff <= self.threshold:
print(f"Detected duplicate bubble (ID: {stored_id}, Hash difference: {hash_diff})")
if sender_name:
print(f"Sender: {sender_name}, Recorded sender: {bubble_data.get('sender', 'Unknown')}")
return True
# Not a duplicate, add to recent bubbles list
self.recent_bubbles[bubble_id] = {
'hash': current_hash,
'sender': sender_name
}
# If exceeding maximum count, remove oldest item
while len(self.recent_bubbles) > self.max_bubbles:
self.recent_bubbles.popitem(last=False) # Remove first item (oldest)
self._save_storage()
return False
def clear_all(self):
"""Clear all records"""
with self.lock:
count = len(self.recent_bubbles)
self.recent_bubbles.clear()
self._save_storage()
print(f"Cleared all {count} bubble records")
return count
def save_debug_image(self, bubble_snapshot, bubble_id, hash_value):
"""Save debug image (optional feature)"""
try:
debug_dir = "bubble_debug"
if not os.path.exists(debug_dir):
os.makedirs(debug_dir)
# Save original image
img_path = os.path.join(debug_dir, f"{bubble_id}_{hash_value}.png")
bubble_snapshot.save(img_path)
print(f"Saved debug image: {img_path}")
except Exception as e:
print(f"Failed to save debug image: {e}")
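The core idea of `is_duplicate` — compare perceptual hashes by Hamming distance and treat anything under a threshold as the same bubble — can be shown without PIL or imagehash. Note that `imagehash.phash` is DCT-based and far more robust; the average hash below is a deliberately simplified toy:

```python
def average_hash(pixels):
    """Toy average hash over a flat list of grayscale pixel values:
    each bit is 1 if the pixel is above the mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming(h1, h2):
    """Number of differing bits, analogous to imagehash's `hash1 - hash2`."""
    return sum(a != b for a, b in zip(h1, h2))

base = average_hash([10, 200, 30, 220, 15, 210, 25, 205])
near = average_hash([12, 198, 33, 219, 14, 212, 27, 204])  # slight noise
far = average_hash([200, 10, 220, 30, 210, 15, 205, 25])   # inverted image

is_duplicate = hamming(base, near) <= 2  # small distance -> same bubble
is_distinct = hamming(base, far) > 2     # large distance -> different bubble
```

Small pixel-level noise leaves the bit pattern unchanged, which is exactly why this beats exact-match text deduplication for slightly varying bubbles.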

templates/chat_option.png Normal file (binary, not shown; 3.3 KiB)

(binary file, not shown; 5.0 KiB)


@ -412,30 +412,46 @@ class ChromaDBBackup:
shutil.rmtree(temp_dir) shutil.rmtree(temp_dir)
return False return False
def schedule_backup(self, interval: str, description: str = "", keep_count: int = 0) -> bool: def schedule_backup(self, interval: str, description: str = "", keep_count: int = 0, at_time: Optional[str] = None) -> bool:
"""排程定期備份 """排程定期備份
interval: 備份間隔 - daily, weekly, hourly, 自定義 cron 表達式 interval: 備份間隔 - daily, weekly, hourly
description: 備份描述 description: 備份描述
keep_count: 保留的備份數量0表示不限制 keep_count: 保留的備份數量0表示不限制
at_time: 執行的時間格式 "HH:MM" (例如 "14:30")僅對 daily, weekly, monthly 有效
""" """
job_id = f"scheduled_{interval}_{int(time.time())}" job_id = f"scheduled_{interval}_{int(time.time())}"
# 驗證 at_time 格式
if at_time:
try:
time.strptime(at_time, "%H:%M")
except ValueError:
self.logger.error(f"無效的時間格式: {at_time}. 請使用 HH:MM 格式.")
return False
# 如果是每小時備份,則忽略 at_time
if interval == "hourly":
at_time = None
try: try:
# 根據間隔設置排程 # 根據間隔設置排程
if interval == "hourly": if interval == "hourly":
schedule.every().hour.do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval) schedule.every().hour.do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval, at_time=at_time)
elif interval == "daily": elif interval == "daily":
schedule.every().day.at("00:00").do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval) schedule_time = at_time if at_time else "00:00"
schedule.every().day.at(schedule_time).do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval, at_time=at_time)
elif interval == "weekly": elif interval == "weekly":
schedule.every().monday.at("00:00").do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval) schedule_time = at_time if at_time else "00:00"
schedule.every().monday.at(schedule_time).do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval, at_time=at_time)
elif interval == "monthly": elif interval == "monthly":
schedule_time = at_time if at_time else "00:00"
# 每月1日執行 # 每月1日執行
schedule.every().day.at("00:00").do(self._check_monthly_schedule, job_id=job_id, description=description, interval=interval) schedule.every().day.at(schedule_time).do(self._check_monthly_schedule, job_id=job_id, description=description, interval=interval, at_time=at_time)
else: else:
# 自定義間隔 - 直接使用字符串作為cron表達式
self.logger.warning(f"不支援的排程間隔: {interval},改用每日排程") self.logger.warning(f"不支援的排程間隔: {interval},改用每日排程")
schedule.every().day.at("00:00").do(self._run_scheduled_backup, job_id=job_id, description=description, interval="daily") schedule_time = at_time if at_time else "00:00"
schedule.every().day.at(schedule_time).do(self._run_scheduled_backup, job_id=job_id, description=description, interval="daily", at_time=at_time)
# 存儲排程任務信息 # 存儲排程任務信息
self.scheduled_jobs[job_id] = { self.scheduled_jobs[job_id] = {
@ -443,10 +459,11 @@ class ChromaDBBackup:
"description": description, "description": description,
"created": datetime.datetime.now(), "created": datetime.datetime.now(),
"keep_count": keep_count, "keep_count": keep_count,
"next_run": self._get_next_run_time(interval) "at_time": at_time, # 新增
"next_run": self._get_next_run_time(interval, at_time)
} }
self.logger.info(f"已排程 {interval} 備份任務ID: {job_id}") self.logger.info(f"已排程 {interval} 備份 (時間: {at_time if at_time else '預設'})任務ID: {job_id}")
return True return True
except Exception as e: except Exception as e:
@ -459,32 +476,66 @@ class ChromaDBBackup:
return self._run_scheduled_backup(job_id, description, interval) return self._run_scheduled_backup(job_id, description, interval)
return None return None
    def _get_next_run_time(self, interval: str, at_time: Optional[str] = None) -> datetime.datetime:
        """Get the next run time"""
        now = datetime.datetime.now()

        target_hour, target_minute = 0, 0
        if at_time:
            try:
                t = time.strptime(at_time, "%H:%M")
                target_hour, target_minute = t.tm_hour, t.tm_min
            except ValueError:
                # Fall back to the default time when the format is invalid
                pass

        if interval == "hourly":
            # Hourly jobs ignore at_time and run at the next full hour
            next_run_time = now.replace(minute=0, second=0, microsecond=0) + datetime.timedelta(hours=1)
            # If the computed time has already passed, add another hour
            if next_run_time <= now:
                next_run_time += datetime.timedelta(hours=1)
            return next_run_time
        elif interval == "daily":
            next_run_time = now.replace(hour=target_hour, minute=target_minute, second=0, microsecond=0)
            if next_run_time <= now:  # today's time has already passed, so use tomorrow
                next_run_time += datetime.timedelta(days=1)
            return next_run_time
        elif interval == "weekly":
            # Compute next Monday
            next_run_time = now.replace(hour=target_hour, minute=target_minute, second=0, microsecond=0)
            days_ahead = 0 - next_run_time.weekday()  # 0 is Monday
            if days_ahead <= 0:  # target day already happened this week
                days_ahead += 7
            next_run_time += datetime.timedelta(days=days_ahead)
            # If the computed time has already passed (e.g. it is Monday but the configured time is past), use the Monday after next
            if next_run_time <= now:
                next_run_time += datetime.timedelta(weeks=1)
            return next_run_time
        elif interval == "monthly":
            # Compute the 1st of next month
            next_run_time = now.replace(day=1, hour=target_hour, minute=target_minute, second=0, microsecond=0)
            if now.month == 12:
                next_run_time = next_run_time.replace(year=now.year + 1, month=1)
            else:
                next_run_time = next_run_time.replace(month=now.month + 1)
            # If the computed time has already passed (e.g. it is the 1st but the configured time is past), use the 1st of the month after next
            if next_run_time <= now:
                if next_run_time.month == 12:
                    next_run_time = next_run_time.replace(year=next_run_time.year + 1, month=1)
                else:
                    next_run_time = next_run_time.replace(month=next_run_time.month + 1)
            return next_run_time

        # Default: return tomorrow
        default_next_run = now.replace(hour=target_hour, minute=target_minute, second=0, microsecond=0) + datetime.timedelta(days=1)
        return default_next_run
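The branches of `_get_next_run_time` are plain `datetime` arithmetic and can be checked in isolation. A minimal standalone sketch of the weekly branch (next Monday at a configured time, with the same roll-over rule), using hypothetical reference dates:

```python
import datetime

def next_monday_run(now: datetime.datetime, hour: int = 0, minute: int = 0) -> datetime.datetime:
    """Next Monday at hour:minute, mirroring the weekly branch above."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    days_ahead = 0 - candidate.weekday()  # weekday() == 0 is Monday
    if days_ahead <= 0:  # Monday already started this week
        days_ahead += 7
    candidate += datetime.timedelta(days=days_ahead)
    if candidate <= now:  # configured time already passed
        candidate += datetime.timedelta(weeks=1)
    return candidate

# Wednesday 2025-05-14 10:00 -> Monday 2025-05-19 03:30
print(next_monday_run(datetime.datetime(2025, 5, 14, 10, 0), 3, 30))
```

Note that because `days_ahead <= 0` always rolls forward, a run scheduled on a Monday always lands on the following Monday; the final `<= now` guard is a safety net rather than the common path.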
    def _run_scheduled_backup(self, job_id: str, description: str, interval: str, at_time: Optional[str] = None):
        """Execute a scheduled backup job"""
        job_info = self.scheduled_jobs.get(job_id)
        if not job_info:
@@ -493,7 +544,7 @@ class ChromaDBBackup:
        try:
            # Update the next run time
            self.scheduled_jobs[job_id]["next_run"] = self._get_next_run_time(interval, at_time)

            # Run the backup
            timestamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
@@ -693,7 +744,8 @@ class ChromaDBBackup:
                "description": job_data["description"],
                "created": job_data["created"].strftime("%Y-%m-%d %H:%M:%S"),
                "next_run": job_data["next_run"].strftime("%Y-%m-%d %H:%M:%S") if job_data["next_run"] else "未知",
                "keep_count": job_data["keep_count"],
                "at_time": job_data.get("at_time", "N/A")  # new
            }
            jobs_info.append(job_info)
@@ -967,12 +1019,14 @@ class ChromaDBBackupUI:
        jobs_frame = ttk.Frame(schedule_frame)
        jobs_frame.pack(fill=BOTH, expand=YES)

        columns = ("interval", "next_run", "at_time")  # added at_time
        self.jobs_tree = ttk.Treeview(jobs_frame, columns=columns, show="headings", height=5)
        self.jobs_tree.heading("interval", text="間隔")
        self.jobs_tree.heading("next_run", text="下次執行")
        self.jobs_tree.heading("at_time", text="執行時間")  # new
        self.jobs_tree.column("interval", width=100)
        self.jobs_tree.column("next_run", width=150)
        self.jobs_tree.column("at_time", width=80)  # new

        scrollbar = ttk.Scrollbar(jobs_frame, orient=VERTICAL, command=self.jobs_tree.yview)
        self.jobs_tree.configure(yscrollcommand=scrollbar.set)
@@ -1164,7 +1218,8 @@ class ChromaDBBackupUI:
                iid=job["id"],  # use the job ID as the tree item ID
                values=(
                    f"{job['interval']} ({job['description']})",
                    job["next_run"],
                    job.get("at_time", "N/A")  # new
                )
            )
@@ -1730,7 +1785,7 @@ class ChromaDBBackupUI:
        # Create the dialog
        dialog = tk.Toplevel(self.root)
        dialog.title("排程備份")
        dialog.geometry("450x550")  # taller to fit the time picker
        dialog.resizable(False, False)
        dialog.grab_set()
@@ -1747,17 +1802,17 @@ class ChromaDBBackupUI:
        # Interval selection
        interval_frame = ttk.Frame(main_frame)
        interval_frame.pack(fill=X, pady=(0, 10))  # reduced pady

        ttk.Label(interval_frame, text="備份間隔:").pack(anchor=W)

        interval_var = tk.StringVar(value="daily")
        intervals = [
            ("每小時 (忽略時間設定)", "hourly"),  # hourly ignores the time setting
            ("每天", "daily"),
            ("每週 (週一)", "weekly"),  # weekly defaults to Monday
            ("每月 (1號)", "monthly")  # monthly defaults to the 1st
        ]

        for text, value in intervals:
@@ -1766,17 +1821,50 @@ class ChromaDBBackupUI:
                text=text,
                variable=interval_var,
                value=value
            ).pack(anchor=W, padx=(20, 0), pady=1)  # reduced pady
        # Time selection (hour and minute)
        time_frame = ttk.Frame(main_frame)
        time_frame.pack(fill=X, pady=(5, 10))  # reduced pady

        ttk.Label(time_frame, text="執行時間 (HH:MM):").pack(side=LEFT, anchor=W)

        hour_var = tk.StringVar(value="00")
        minute_var = tk.StringVar(value="00")

        # Hour Spinbox
        ttk.Spinbox(
            time_frame,
            from_=0,
            to=23,
            textvariable=hour_var,
            width=3,
            format="%02.0f"  # display as two digits
        ).pack(side=LEFT, padx=(5, 0))

        ttk.Label(time_frame, text=":").pack(side=LEFT, padx=2)

        # Minute Spinbox
        ttk.Spinbox(
            time_frame,
            from_=0,
            to=59,
            textvariable=minute_var,
            width=3,
            format="%02.0f"  # display as two digits
        ).pack(side=LEFT, padx=(0, 5))

        ttk.Label(time_frame, text="(每小時排程將忽略此設定)").pack(side=LEFT, padx=(5, 0), anchor=W)
        # Description
        ttk.Label(main_frame, text="備份描述:").pack(anchor=W, pady=(0, 5))
        description_var = tk.StringVar(value="排程備份")
        ttk.Entry(main_frame, textvariable=description_var, width=40).pack(fill=X, pady=(0, 10))  # reduced pady

        # Keep count
        keep_frame = ttk.Frame(main_frame)
        keep_frame.pack(fill=X, pady=(0, 10))  # reduced pady

        ttk.Label(keep_frame, text="最多保留備份數量:").pack(side=LEFT)
@@ -1795,13 +1883,12 @@ class ChromaDBBackupUI:
        ).pack(side=LEFT, padx=(5, 0))

        # Separator
        ttk.Separator(main_frame, orient=HORIZONTAL).pack(fill=X, pady=10)  # reduced pady

        # Bottom button area
        btn_frame = ttk.Frame(main_frame)
        btn_frame.pack(fill=X, pady=(5, 0))  # reduced pady

        cancel_btn = ttk.Button(
            btn_frame,
            text="取消",
@@ -1810,7 +1897,6 @@ class ChromaDBBackupUI:
        )
        cancel_btn.pack(side=LEFT, padx=(0, 10))

        create_btn = ttk.Button(
            btn_frame,
            text="加入排程",
@@ -1819,22 +1905,22 @@ class ChromaDBBackupUI:
                interval_var.get(),
                description_var.get(),
                keep_count_var.get(),
                f"{hour_var.get()}:{minute_var.get()}",  # combined time string
                dialog
            )
        )
        create_btn.pack(side=LEFT)

        note_frame = ttk.Frame(main_frame)
        note_frame.pack(fill=X, pady=(10, 0))  # reduced pady

        ttk.Label(
            note_frame,
            text="請確保點擊「加入排程」按鈕完成設置",
            foreground="blue"
        ).pack()
    def create_schedule(self, interval, description, keep_count_str, at_time_str, dialog):
        """Create a backup schedule"""
        dialog.destroy()
@@ -1843,15 +1929,26 @@ class ChromaDBBackupUI:
        except ValueError:
            keep_count = 0

        # Validate the time format
        try:
            time.strptime(at_time_str, "%H:%M")
        except ValueError:
            messagebox.showerror("錯誤", f"無效的時間格式: {at_time_str}. 請使用 HH:MM 格式.")
            self.status_var.set("創建排程失敗: 無效的時間格式")
            return

        # For hourly schedules, set at_time to None
        effective_at_time = at_time_str if interval != "hourly" else None

        success = self.backup.schedule_backup(interval, description, keep_count, effective_at_time)
        if success:
            self.status_var.set(f"已創建 {interval} 備份排程 (時間: {effective_at_time if effective_at_time else '每小時'})")
            self.refresh_scheduled_jobs()
            messagebox.showinfo("成功", f"已成功創建 {interval} 備份排程 (時間: {effective_at_time if effective_at_time else '每小時'})")
        else:
            self.status_var.set("創建排程失敗")
            messagebox.showerror("錯誤", "無法創建備份排程,請檢查日誌。")
    def quick_schedule(self, interval):
        """Quickly create a scheduled backup"""
@@ -1931,7 +2028,8 @@ class ChromaDBBackupUI:
            success = self.backup._run_scheduled_backup(
                job_id,
                job_info["description"],
                job_info["interval"],
                job_info.get("at_time")  # pass at_time through
            )
            self.root.after(0, lambda: self.finalize_job_execution(success))
@@ -1971,7 +2069,7 @@ class ChromaDBBackupUI:
        ).pack(anchor=W, pady=(0, 15))

        # Create the table
        columns = ("id", "interval", "description", "next_run", "keep_count", "at_time")  # added at_time
        tree = ttk.Treeview(frame, columns=columns, show="headings", height=10)

        tree.heading("id", text="任務ID")
@@ -1979,12 +2077,14 @@ class ChromaDBBackupUI:
        tree.heading("description", text="描述")
        tree.heading("next_run", text="下次執行")
        tree.heading("keep_count", text="保留數量")
        tree.heading("at_time", text="執行時間")  # new

        tree.column("id", width=120)
        tree.column("interval", width=70)
        tree.column("description", width=120)
        tree.column("next_run", width=130)
        tree.column("keep_count", width=70)
        tree.column("at_time", width=70)  # new

        # Add the data
        for job in jobs:
@@ -1995,7 +2095,8 @@ class ChromaDBBackupUI:
                    job["interval"],
                    job["description"],
                    job["next_run"],
                    job["keep_count"],
                    job.get("at_time", "N/A")  # new
                )
            )
@@ -2346,4 +2447,4 @@ def main():
    root.mainloop()

if __name__ == "__main__":
    main()


@@ -3,6 +3,7 @@ import tkinter as tk
from tkinter import filedialog, messagebox
import json
import chromadb
from chromadb.utils import embedding_functions  # new import
import datetime
import pandas as pd
import threading
@@ -15,6 +16,8 @@ from ttkbootstrap.scrolled import ScrolledFrame
import numpy as np
import logging
from typing import List, Dict, Any, Optional, Union, Tuple
import inspect  # used to inspect function signatures and detect hybrid-search support
import re  # new import, for ID parsing in the UI

class ChromaDBReader:
    """Main data model for the ChromaDB backup reader"""
@@ -28,6 +31,9 @@ class ChromaDBReader:
        self.query_results = []  # current query results
        self.chroma_client = None  # ChromaDB client
        self.selected_embedding_model_name = "default"  # embedding model used for queries
        self.query_embedding_function = None  # instantiated query embedding function; None means use the collection's internal default

        # Set up logging
        logging.basicConfig(
            level=logging.INFO,
@@ -118,6 +124,41 @@ class ChromaDBReader:
            self.chroma_client = None
            self.collection_names = []
            return False
    def set_query_embedding_model(self, model_name: str):
        """Set the embedding model used for queries"""
        self.selected_embedding_model_name = model_name
        if model_name == "default":
            self.query_embedding_function = None  # use the collection's internal embedding function
            self.logger.info("查詢將使用集合內部嵌入模型。")
        elif model_name == "all-MiniLM-L6-v2":
            try:
                # Note: the sentence-transformers package must be installed
                self.query_embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="all-MiniLM-L6-v2")
                self.logger.info(f"查詢將使用外部嵌入模型: {model_name}")
            except Exception as e:
                self.logger.error(f"無法加載 SentenceTransformer all-MiniLM-L6-v2: {e}。將使用集合內部模型。")
                self.query_embedding_function = None
        elif model_name == "paraphrase-multilingual-MiniLM-L12-v2":
            try:
                # Note: the sentence-transformers package must be installed
                self.query_embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="paraphrase-multilingual-MiniLM-L12-v2")
                self.logger.info(f"查詢將使用外部嵌入模型: {model_name}")
            except Exception as e:
                self.logger.error(f"無法加載 SentenceTransformer paraphrase-multilingual-MiniLM-L12-v2: {e}。將使用集合內部模型。")
                self.query_embedding_function = None
        # Support for the newly added model
        elif model_name == "paraphrase-multilingual-mpnet-base-v2":
            try:
                # Note: the sentence-transformers package must be installed
                self.query_embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
                self.logger.info(f"查詢將使用外部嵌入模型: {model_name}")
            except Exception as e:
                self.logger.error(f"無法加載 SentenceTransformer paraphrase-multilingual-mpnet-base-v2: {e}。將使用集合內部模型。")
                self.query_embedding_function = None
        else:
            self.logger.warning(f"未知的查詢嵌入模型: {model_name}, 將使用集合內部模型。")
            self.query_embedding_function = None
    def load_collection(self, collection_name: str) -> bool:
        """Load the specified collection"""
@@ -125,6 +166,9 @@ class ChromaDBReader:
            return False

        try:
            # An embedding_function is normally bound to a collection when it is created.
            # We are only reading here, so the collection's embedding_function is already fixed;
            # self.query_embedding_function is used at query time to generate query_embeddings.
            self.current_collection = self.chroma_client.get_collection(collection_name)
            self.logger.info(f"已加載集合: {collection_name}")
            return True
@@ -133,46 +177,220 @@ class ChromaDBReader:
            self.current_collection = None
            return False
    def execute_query(self, query_text: str, n_results: int = 5,
                      query_type: str = "basic",
                      where: Dict = None,
                      where_document: Dict = None,
                      include: List[str] = None,
                      metadata_filter: Dict = None,
                      hybrid_alpha: float = None) -> List[Dict]:
        """Execute a query and return the results

        Args:
            query_text: the query text
            n_results: number of results to return
            query_type: query type (basic, metadata, hybrid, multi_vector)
            where: where filter conditions
            where_document: document-content filter conditions
            include: which fields to include in the results
            metadata_filter: metadata filter conditions
            hybrid_alpha: weight for hybrid search, between 0 and 1; larger values favor keyword search
        """
        if not self.current_collection or not query_text:
            return []

        try:
            query_params = {
                "n_results": n_results
            }

            # Basic query handling
            if query_type == "basic":
                query_params["query_texts"] = [query_text]
            # Multi-vector query (used to compare the similarity of several queries)
            elif query_type == "multi_vector":
                # Support multiple query texts separated by "|||" or newlines
                if "|||" in query_text:
                    query_texts = [text.strip() for text in query_text.split("|||")]
                else:
                    query_texts = [text.strip() for text in query_text.splitlines() if text.strip()]
                query_params["query_texts"] = query_texts

            # Add the remaining query parameters
            if where:
                query_params["where"] = where
            if where_document:
                query_params["where_document"] = where_document
            if include:
                query_params["include"] = include
            if metadata_filter:
                # Merge the metadata filter directly into the where conditions
                if "where" not in query_params:
                    query_params["where"] = {}
                query_params["where"].update(metadata_filter)

            # Hybrid search handling
            if query_type == "hybrid" and hybrid_alpha is not None:
                # Check whether this ChromaDB version supports hybrid search
                if hasattr(self.current_collection, "query") and "alpha" in inspect.signature(self.current_collection.query).parameters:
                    query_params["alpha"] = hybrid_alpha
                    # Hybrid search still needs query_texts
                    if "query_texts" not in query_params:
                        query_params["query_texts"] = [query_text]
                else:
                    self.logger.warning("當前 ChromaDB 版本不支持混合搜索,將使用基本查詢")
                    query_type = "basic"  # fall back to a basic query
                    query_params["query_texts"] = [query_text]
            elif query_type == "hybrid" and hybrid_alpha is None:
                # Hybrid search without an alpha value falls back to a basic query
                self.logger.warning("混合搜索未提供 Alpha 值,將使用基本查詢")
                query_type = "basic"
                query_params["query_texts"] = [query_text]

            # Set query_texts for any remaining query type that has not set it yet
            if query_type not in ["multi_vector", "hybrid"] and "query_texts" not in query_params:
                query_params["query_texts"] = [query_text]

            # If an external embedding model is selected and this is not a hybrid query,
            # generate the query embeddings ourselves
            if query_type != "hybrid" and \
               "query_texts" in query_params and \
               self.query_embedding_function:
                texts_to_embed = query_params["query_texts"]
                try:
                    # self.query_embedding_function takes List[str] and returns List[List[float]]
                    generated_embeddings = self.query_embedding_function(texts_to_embed)

                    if generated_embeddings and all(isinstance(emb, list) for emb in generated_embeddings):
                        query_params["query_embeddings"] = generated_embeddings
                        if "query_texts" in query_params:  # make sure it exists before deleting
                            del query_params["query_texts"]
                        self.logger.info(f"使用 {self.selected_embedding_model_name} 生成了 {len(generated_embeddings)} 個查詢嵌入。")
                    else:
                        self.logger.warning(f"未能使用 {self.selected_embedding_model_name} 為所有查詢文本生成有效嵌入。將回退到使用集合預設嵌入函數進行文本查詢。嵌入結果: {generated_embeddings}")
                except Exception as e:
                    self.logger.error(f"使用 {self.selected_embedding_model_name} 生成查詢嵌入時出錯: {e}。將回退到使用集合預設嵌入函數進行文本查詢。")

            # Execute the query
            results = self.current_collection.query(**query_params)

            # Process the results
            processed_results = []

            # Result lists returned by the query
            ids_list = results.get('ids', [[]])
            documents_list = results.get('documents', [[]])
            metadatas_list = results.get('metadatas', [[]])
            distances_list = results.get('distances', [[]])

            # Make sure the lists have matching lengths, with defaults for empty lists
            num_queries = len(ids_list)
            if not documents_list or len(documents_list) != num_queries:
                documents_list = [[] for _ in range(num_queries)]
            if not metadatas_list or len(metadatas_list) != num_queries:
                metadatas_list = [[{}] * len(ids_list[i]) for i in range(num_queries)]
            if not distances_list or len(distances_list) != num_queries:
                distances_list = [[0.0] * len(ids_list[i]) for i in range(num_queries)]

            # With multiple query texts, handle each query's results separately
            for query_idx, (ids, documents, metadatas, distances) in enumerate(zip(
                ids_list,
                documents_list,
                metadatas_list,
                distances_list
            )):
                # Process the results of this query
                for i, (doc_id, document, metadata, distance) in enumerate(zip(
                    ids, documents,
                    metadatas if metadatas else [{}] * len(ids),  # ensure metadata exists
                    distances if distances else [0.0] * len(ids)  # ensure distances exist
                )):
                    # Compute a similarity score
                    similarity = 1.0 - min(float(distance) if distance is not None else 1.0, 1.0)

                    result_item = {
                        "rank": i + 1,
                        "query_index": query_idx,
                        "id": doc_id,
                        "document": document,
                        "metadata": metadata if metadata else {},  # ensure metadata is a dict
                        "similarity": similarity,
                        "distance": float(distance) if distance is not None else 0.0,
                        "query_type": query_type
                    }

                    if query_type == "hybrid":
                        result_item["hybrid_alpha"] = hybrid_alpha

                    processed_results.append(result_item)

            self.query_results = processed_results
            self.logger.info(f"查詢完成,找到 {len(processed_results)} 個結果,查詢類型: {query_type}")
            return processed_results

        except Exception as e:
            self.logger.error(f"執行查詢時出錯: {str(e)}")
            self.query_results = []
            return []
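Two pieces of `execute_query` are easy to verify in isolation: the multi-vector text splitting and the distance-to-similarity clamp. A standalone sketch of both:

```python
def split_query_texts(query_text: str) -> list:
    """Split on "|||" when present, otherwise on newlines (mirrors the multi_vector branch)."""
    if "|||" in query_text:
        return [t.strip() for t in query_text.split("|||")]
    return [t.strip() for t in query_text.splitlines() if t.strip()]

def to_similarity(distance) -> float:
    """The clamp-based conversion used above: 1 - min(distance, 1)."""
    return 1.0 - min(float(distance) if distance is not None else 1.0, 1.0)

print(split_query_texts("a |||b\n||| c"))           # ['a', 'b', 'c']
print(to_similarity(0.25), to_similarity(2.0))      # 0.75 0.0
```

Note that this mapping only behaves like a similarity for metrics whose distances stay near the [0, 1] range (e.g. cosine distance); larger L2 distances all clamp to a similarity of 0, which matches the caveat in the original comment about adjusting for ChromaDB's distance metric.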
    def get_documents_by_ids(self, doc_ids: List[str]) -> List[Dict]:
        """Fetch documents by a list of document IDs"""
        if not self.current_collection:
            self.logger.warning("沒有選擇集合,無法按 ID 獲取文檔。")
            return []
        if not doc_ids:
            self.logger.warning("未提供文檔 ID。")
            return []

        try:
            results = self.current_collection.get(
                ids=doc_ids,
                include=["documents", "metadatas"]
            )

            processed_results = []
            retrieved_ids = results.get('ids', [])
            retrieved_documents = results.get('documents', [])
            retrieved_metadatas = results.get('metadatas', [])

            # Build a map of the retrieved documents for quick lookup
            found_docs_map = {}
            for i, r_id in enumerate(retrieved_ids):
                found_docs_map[r_id] = {
                    "document": retrieved_documents[i] if i < len(retrieved_documents) else None,
                    "metadata": retrieved_metadatas[i] if i < len(retrieved_metadatas) else {}
                }

            rank_counter = 1
            for original_id in doc_ids:  # walk the requested IDs to preserve their order and spot missing ones
                if original_id in found_docs_map:
                    doc_data = found_docs_map[original_id]
                    if doc_data["document"] is not None:
                        processed_results.append({
                            "rank": rank_counter,
                            "id": original_id,
                            "document": doc_data["document"],
                            "metadata": doc_data["metadata"],
                            "similarity": None,  # not applicable
                            "distance": None,  # not applicable
                            "query_type": "id_lookup"
                        })
                        rank_counter += 1
                    else:  # the ID was found but the document is empty (should not happen with get unless include is misconfigured)
                        self.logger.warning(f"ID {original_id} 找到但文檔內容為空。")
                # else: the ID was not in the results; it is silently skipped, though a marker could be added
                #     self.logger.info(f"ID {original_id} 未在集合中找到。")

            self.query_results = processed_results
            self.logger.info(f"按 ID 查詢完成,從請求的 {len(doc_ids)} 個ID中實際找到 {len(processed_results)} 個文檔。")
            return processed_results
        except Exception as e:
            self.logger.error(f"按 ID 獲取文檔時出錯: {str(e)}")
            # traceback.print_exc()  # for debugging
            self.query_results = []
            return []
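`get_documents_by_ids` builds `found_docs_map` and then re-walks `doc_ids` because `collection.get()` may not return items in the requested order (and silently omits IDs it cannot find). The core of that order-preserving pattern, reduced to plain lists:

```python
def order_preserving_lookup(doc_ids, retrieved_ids, retrieved_documents):
    """Re-order retrieved documents to match the requested IDs, skipping missing ones."""
    found = {r_id: retrieved_documents[i] for i, r_id in enumerate(retrieved_ids)}
    return [(doc_id, found[doc_id]) for doc_id in doc_ids if doc_id in found]

# "b" was requested but not retrieved, so it is skipped:
print(order_preserving_lookup(["a", "b", "c"], ["c", "a"], ["doc-c", "doc-a"]))
# → [('a', 'doc-a'), ('c', 'doc-c')]
```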
    def get_collection_info(self, collection_name: str) -> Dict:
        """Get detailed information about a collection"""
@@ -235,6 +453,16 @@ class ChromaDBReaderUI:
        # Set up the window
        self.root.title("ChromaDB 備份讀取器")
        self.root.geometry("1280x800")

        # State for the query embedding model
        self.embedding_model_var = tk.StringVar(value="預設 (ChromaDB)")  # display name
        self.embedding_models = {
            "預設 (ChromaDB)": "default",
            "all-MiniLM-L6-v2 (ST)": "all-MiniLM-L6-v2",
            "paraphrase-multilingual-MiniLM-L12-v2 (ST)": "paraphrase-multilingual-MiniLM-L12-v2",
            "paraphrase-multilingual-mpnet-base-v2 (ST)": "paraphrase-multilingual-mpnet-base-v2"  # new model option
        }

        self.setup_ui()

        # Default theme
@@ -262,9 +490,13 @@ class ChromaDBReaderUI:
        # Right panel (query and results)
        self.right_panel = ttk.Frame(self.main_frame)
        self.right_panel.pack(side=LEFT, fill=BOTH, expand=YES)

        # Set up the status bar early, so self.status_var is defined before anything else uses it
        self.setup_status_bar()

        # Set up the left panel
        self.setup_directory_frame()
        self.setup_embedding_model_frame()  # new embedding-model selection frame
        self.setup_backups_frame()
        self.setup_collections_frame()
@@ -272,9 +504,6 @@ class ChromaDBReaderUI:
        self.setup_query_frame()
        self.setup_results_frame()

        # Set up the menu
        self.setup_menu()
@@ -314,6 +543,24 @@ class ChromaDBReaderUI:
        ttk.Entry(dir_frame, textvariable=self.backups_dir_var).pack(side=LEFT, fill=X, expand=YES)
        ttk.Button(dir_frame, text="瀏覽", command=self.browse_directory).pack(side=LEFT, padx=(5, 0))
        ttk.Button(dir_frame, text="載入", command=self.load_backups_directory).pack(side=LEFT, padx=(5, 0))

    def setup_embedding_model_frame(self):
        """Set up the query embedding model selection frame"""
        embedding_frame = ttk.LabelFrame(self.left_panel, text="查詢嵌入模型", padding=10)
        embedding_frame.pack(fill=X, pady=(0, 10))

        self.embedding_model_combo = ttk.Combobox(
            embedding_frame,
            textvariable=self.embedding_model_var,
            values=list(self.embedding_models.keys()),
            state="readonly"
        )
        self.embedding_model_combo.pack(fill=X, expand=YES)
        self.embedding_model_combo.set(list(self.embedding_models.keys())[0])  # default display value
        self.embedding_model_combo.bind("<<ComboboxSelected>>", self.on_embedding_model_changed)

        # Initialize the embedding model selection in the reader
        self.on_embedding_model_changed()

    def setup_backups_frame(self):
        """Set up the backups list frame"""
@@ -388,12 +635,46 @@ class ChromaDBReaderUI:
        query_frame = ttk.LabelFrame(self.right_panel, text="查詢", padding=10)
        query_frame.pack(fill=X, pady=(0, 10))

        # Create a Notebook to hold the different query-type tabs
        self.query_notebook = ttk.Notebook(query_frame)
        self.query_notebook.pack(fill=X, pady=5)

        # Basic query tab
        self.basic_query_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.basic_query_frame, text="基本查詢")

        # Metadata query tab
        self.metadata_query_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.metadata_query_frame, text="元數據查詢")

        # Hybrid query tab
        self.hybrid_query_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.hybrid_query_frame, text="混合查詢")

        # Multi-vector query tab
        self.multi_vector_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.multi_vector_frame, text="多向量查詢")

        # ID query tab (new)
        self.id_query_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.id_query_frame, text="ID 查詢")

        # Set up the individual tabs
        self.setup_basic_query_tab()
        self.setup_metadata_query_tab()
        self.setup_hybrid_query_tab()
        self.setup_multi_vector_tab()
        self.setup_id_query_tab()  # new

        # Query parameters (shared)
        params_frame = ttk.Frame(query_frame)
        params_frame.pack(fill=X)
@@ -405,9 +686,102 @@ class ChromaDBReaderUI:
        ttk.Button(
            query_frame,
            text="執行查詢",
            command=self.execute_query,  # note: this execute_query method is replaced by the new implementation
            style="Accent.TButton"
        ).pack(pady=10)
    def setup_basic_query_tab(self):
        """Set up the basic query tab"""
        ttk.Label(self.basic_query_frame, text="查詢文本:").pack(anchor=W)
        self.basic_query_text = tk.Text(self.basic_query_frame, height=4, width=50)
        self.basic_query_text.pack(fill=X, pady=5)

    def setup_metadata_query_tab(self):
        """Set up the metadata query tab"""
        ttk.Label(self.metadata_query_frame, text="查詢文本:").pack(anchor=W)
        self.metadata_query_text = tk.Text(self.metadata_query_frame, height=4, width=50)
        self.metadata_query_text.pack(fill=X, pady=5)

        ttk.Label(self.metadata_query_frame, text="元數據過濾條件 (JSON 格式):").pack(anchor=W)
        self.metadata_filter_text = tk.Text(self.metadata_query_frame, height=4, width=50)
        self.metadata_filter_text.pack(fill=X, pady=5)
        self.metadata_filter_text.insert("1.0", '{"key": "value"}')

        # Help button that explains the metadata filter syntax
        ttk.Button(
            self.metadata_query_frame,
            text="?",
            width=2,
            command=self.show_metadata_help
        ).pack(anchor=E)

    def setup_hybrid_query_tab(self):
        """Set up the hybrid query tab"""
        ttk.Label(self.hybrid_query_frame, text="查詢文本:").pack(anchor=W)
        self.hybrid_query_text = tk.Text(self.hybrid_query_frame, height=4, width=50)
        self.hybrid_query_text.pack(fill=X, pady=5)

        alpha_frame = ttk.Frame(self.hybrid_query_frame)
        alpha_frame.pack(fill=X)

        ttk.Label(alpha_frame, text="Alpha 值 (0-1):").pack(side=LEFT)
        self.hybrid_alpha_var = tk.DoubleVar(value=0.5)
        ttk.Scale(
            alpha_frame,
            from_=0.0, to=1.0,
            variable=self.hybrid_alpha_var,
            orient=tk.HORIZONTAL,
            length=200
        ).pack(side=LEFT, padx=5, fill=X, expand=YES)

        # Label showing the Scale's current value
        self.hybrid_alpha_label = ttk.Label(alpha_frame, text=f"{self.hybrid_alpha_var.get():.2f}")
        self.hybrid_alpha_label.pack(side=LEFT)
        # Update the label whenever the Scale value changes
        self.hybrid_alpha_var.trace_add("write", lambda *args: self.hybrid_alpha_label.config(text=f"{self.hybrid_alpha_var.get():.2f}"))

        ttk.Label(self.hybrid_query_frame, text="注意: Alpha=0 完全使用向量搜索,Alpha=1 完全使用關鍵詞搜索").pack(pady=2)
        ttk.Label(self.hybrid_query_frame, text="混合查詢將使用集合原始嵌入模型,忽略上方選擇的查詢嵌入模型。", font=("TkDefaultFont", 8)).pack(pady=2)

    def setup_multi_vector_tab(self):
        """Set up the multi-vector query tab"""
        ttk.Label(self.multi_vector_frame, text="多個查詢文本 (每行一個,或使用 ||| 分隔):").pack(anchor=W)
        self.multi_vector_text = tk.Text(self.multi_vector_frame, height=6, width=50)
        self.multi_vector_text.pack(fill=X, pady=5)
        self.multi_vector_text.insert("1.0", "查詢文本 1\n|||查詢文本 2\n|||查詢文本 3")

        ttk.Label(self.multi_vector_frame, text="用於比較多個查詢之間的相似性").pack(pady=5)

    def setup_id_query_tab(self):
        """Set up the ID query tab"""
        ttk.Label(self.id_query_frame, text="文檔 ID (每行一個,或用逗號/空格分隔):").pack(anchor=tk.W)
        self.id_query_text = tk.Text(self.id_query_frame, height=6, width=50)
        self.id_query_text.pack(fill=tk.X, pady=5)
        self.id_query_text.insert("1.0", "id1\nid2,id3 id4")  # example
        ttk.Label(self.id_query_frame, text="此查詢將獲取指定ID的文檔,忽略上方“結果數量”設置。").pack(pady=5)
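The tab's label promises IDs separated by newlines, commas, or spaces, and `re` was imported above "for ID parsing in the UI"; the parsing code itself is outside this diff, so the following is an assumed implementation of that split (the function name is illustrative):

```python
import re

def parse_doc_ids(raw: str) -> list:
    """Split raw input on newlines, commas, or whitespace, dropping empty tokens."""
    return [token for token in re.split(r"[,\s]+", raw) if token]

print(parse_doc_ids("id1\nid2,id3 id4"))
# → ['id1', 'id2', 'id3', 'id4']
```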
    def show_metadata_help(self):
        """Show the metadata filter syntax help"""
        help_text = """元數據過濾語法示例:

基本過濾:
{"category": "文章"}  # 精確匹配

範圍過濾:
{"date": {"$gt": "2023-01-01"}}  # 大於
{"date": {"$lt": "2023-12-31"}}  # 小於
{"count": {"$gte": 10}}  # 大於等於
{"count": {"$lte": 100}}  # 小於等於

多條件過濾:
{"$and": [{"category": "文章"}, {"author": "張三"}]}  # AND 條件
{"$or": [{"category": "文章"}, {"category": "新聞"}]}  # OR 條件

注意: 此處語法遵循 ChromaDB 的過濾語法,非標準 JSON 查詢語法
"""
        messagebox.showinfo("元數據過濾語法說明", help_text)
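The filters in the help text are ordinary Python dicts passed as the `where` argument of `collection.query()` (or typed into the metadata tab as JSON). A minimal sketch combining an exact match with a range condition under `$and` (the field names are illustrative):

```python
import json

where = {
    "$and": [
        {"category": "文章"},             # exact match
        {"date": {"$gt": "2023-01-01"}}   # range: greater than
    ]
}

# The metadata tab accepts the same structure as JSON text:
print(json.dumps(where, ensure_ascii=False))
```

Because `execute_query` merges `metadata_filter` into `where` with `dict.update`, a top-level key appearing in both arguments is overwritten by the metadata filter; using a single `$and` block avoids that ambiguity.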
def setup_results_frame(self):
"""設置結果顯示框架"""
@@ -442,6 +816,26 @@ class ChromaDBReaderUI:
self.status_var = tk.StringVar(value="就緒")
status_label = ttk.Label(status_frame, textvariable=self.status_var, relief=tk.SUNKEN, anchor=W)
status_label.pack(fill=X)
def on_embedding_model_changed(self, event=None):
"""Handle changes to the query embedding model selection"""
selected_display_name = self.embedding_model_var.get()
model_name_key = self.embedding_models.get(selected_display_name, "default")
if hasattr(self, 'reader') and self.reader:
self.reader.set_query_embedding_model(model_name_key)  # update the model held by the Reader
# Update the status-bar hint
if model_name_key == "default":
self.status_var.set("查詢將使用集合內部嵌入模型。")
elif self.reader.query_embedding_function:  # check that the model loaded successfully
self.status_var.set(f"查詢將使用外部模型: {selected_display_name}")
else:  # load failed
self.status_var.set(f"模型 {selected_display_name} 加載失敗/無效,將使用集合內部模型。")
else:
# The Reader is not initialized yet; this normally happens early in UI setup.
# set_query_embedding_model is applied on the first call from setup_embedding_model_frame.
pass
def browse_directory(self):
"""瀏覽選擇備份目錄"""
@@ -527,27 +921,38 @@ class ChromaDBReaderUI:
# Get the selected item
item_id = selection[0]
# item_index = self.backups_tree.index(item_id)  # this index is relative to the currently displayed items
# Get the backup name directly from the Treeview item, then look it up in self.reader.backups
try:
backup_name_from_tree = self.backups_tree.item(item_id)["values"][0]
except IndexError:
self.logger.error("無法從 Treeview 獲取備份名稱")
return
actual_backup_index = -1
for i, backup_info in enumerate(self.reader.backups):
if backup_info["name"] == backup_name_from_tree:
actual_backup_index = i
break
if actual_backup_index == -1:
self.logger.error(f"在備份列表中未找到名為 {backup_name_from_tree} 的備份")
return
# Load the backup
self.status_var.set(f"正在載入備份: {backup_name_from_tree}...")
self.root.update_idletasks()
# Make sure the embedding model held by the Reader is current (on_embedding_model_changed should already have handled this)
# selected_display_name = self.embedding_model_var.get()
# model_key = self.embedding_models.get(selected_display_name, "default")
# self.reader.set_query_embedding_model(model_key)  # not needed; model selection is independent
def load_backup_thread():
# load_backup no longer needs an embedding_model_name parameter; the embedding model selection applies only to queries
success = self.reader.load_backup(actual_backup_index)
self.root.after(0, lambda: self.finalize_backup_loading(success, backup_name_from_tree))
threading.Thread(target=load_backup_thread).start()
@@ -618,7 +1023,7 @@ class ChromaDBReaderUI:
# Get and display collection details
info = self.reader.get_collection_info(collection_name)
info_text = f"集合: {info['name']}\n文檔數: {info['document_count']}\n向量維度: {info['dimension']}"
# messagebox.showinfo("集合信息", info_text)  # commented out for now, to avoid a popup on every collection selection
else:
self.status_var.set(f"載入集合失敗: {collection_name}")
messagebox.showerror("錯誤", f"無法載入集合: {collection_name}")
@@ -629,25 +1034,170 @@ class ChromaDBReaderUI:
messagebox.showinfo("提示", "請先選擇一個集合")
return
# Determine the query type from the currently selected tab
try:
current_tab_widget = self.query_notebook.nametowidget(self.query_notebook.select())
if current_tab_widget == self.basic_query_frame:
current_tab = 0
elif current_tab_widget == self.metadata_query_frame:
current_tab = 1
elif current_tab_widget == self.hybrid_query_frame:
current_tab = 2
elif current_tab_widget == self.multi_vector_frame:
current_tab = 3
elif current_tab_widget == self.id_query_frame:  # added check for the ID query tab
current_tab = 4
else:
messagebox.showerror("錯誤", "未知的查詢標籤頁")
return
except tk.TclError:  # the notebook may not have a selected tab yet
messagebox.showerror("錯誤", "請選擇一個查詢類型標籤頁")
return
# Get query parameters
try:
n_results = int(self.n_results_var.get())
except ValueError:
messagebox.showerror("錯誤", "結果數量必須是整數")
return
# Dispatch to the selected query type
if current_tab == 0:  # basic query
query_text = self.basic_query_text.get("1.0", tk.END).strip()
if not query_text:
messagebox.showinfo("提示", "請輸入查詢文本")
return
self.status_var.set("正在執行基本查詢...")
self.execute_basic_query(query_text, n_results)
elif current_tab == 1:  # metadata query
query_text = self.metadata_query_text.get("1.0", tk.END).strip()
metadata_filter_text = self.metadata_filter_text.get("1.0", tk.END).strip()
if not query_text:  # the query text may be empty if only metadata_filter is used
# messagebox.showinfo("提示", "請輸入查詢文本")
# return
pass  # allow empty query text
try:
metadata_filter = json.loads(metadata_filter_text) if metadata_filter_text else None
except json.JSONDecodeError:
messagebox.showerror("錯誤", "元數據過濾條件必須是有效的 JSON 格式")
return
if not query_text and not metadata_filter:
messagebox.showinfo("提示", "請輸入查詢文本或元數據過濾條件")
return
self.status_var.set("正在執行元數據查詢...")
self.execute_metadata_query(query_text, n_results, metadata_filter)
elif current_tab == 2:  # hybrid query
query_text = self.hybrid_query_text.get("1.0", tk.END).strip()
hybrid_alpha = self.hybrid_alpha_var.get()
if not query_text:
messagebox.showinfo("提示", "請輸入查詢文本")
return
self.status_var.set("正在執行混合查詢...")
self.execute_hybrid_query(query_text, n_results, hybrid_alpha)
elif current_tab == 3:  # multi-vector query
query_text = self.multi_vector_text.get("1.0", tk.END).strip()
if not query_text:
messagebox.showinfo("提示", "請輸入查詢文本")
return
self.status_var.set("正在執行多向量查詢...")
self.execute_multi_vector_query(query_text, n_results)
elif current_tab == 4:  # ID query
id_input_str = self.id_query_text.get("1.0", tk.END).strip()
if not id_input_str:
messagebox.showinfo("提示", "請輸入文檔 ID。")
return
# Parse IDs: commas, spaces, and newlines are all accepted separators
doc_ids = [id_val.strip() for id_val in re.split(r'[,\s\n]+', id_input_str) if id_val.strip()]
if not doc_ids:
messagebox.showinfo("提示", "未解析到有效的文檔 ID。")
return
self.status_var.set("正在按 ID 獲取文檔...")
self.execute_id_lookup_query(doc_ids)
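The ID parsing above accepts commas, spaces, and newlines interchangeably as separators. The `re.split` call can be exercised in isolation; `parse_doc_ids` is a hypothetical helper mirroring it:

```python
import re

def parse_doc_ids(raw):
    """Split user input into document IDs.

    Mirrors the re.split call in the ID query tab: commas and any
    whitespace (including newlines) all act as separators, and empty
    tokens are dropped.
    """
    return [tok for tok in re.split(r"[,\s]+", raw.strip()) if tok]

print(parse_doc_ids("id1\nid2,id3 id4"))  # ['id1', 'id2', 'id3', 'id4']
```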
def execute_basic_query(self, query_text, n_results):
"""Execute a basic query"""
self.status_var.set(f"正在執行基本查詢: {query_text[:30]}...")
self.root.update_idletasks()
def query_thread():
results = self.reader.execute_query(
query_text=query_text,
n_results=n_results,
query_type="basic"
)
self.root.after(0, lambda: self.display_results(results))
threading.Thread(target=query_thread, daemon=True).start()
def execute_metadata_query(self, query_text, n_results, metadata_filter):
"""Execute a metadata query"""
self.status_var.set(f"正在執行元數據查詢: {query_text[:30]}...")
self.root.update_idletasks()
def query_thread():
results = self.reader.execute_query(
query_text=query_text,
n_results=n_results,
query_type="metadata",  # the backend translates this into a where clause
metadata_filter=metadata_filter
)
self.root.after(0, lambda: self.display_results(results))
threading.Thread(target=query_thread, daemon=True).start()
def execute_hybrid_query(self, query_text, n_results, hybrid_alpha):
"""Execute a hybrid query"""
self.status_var.set(f"正在執行混合查詢 (α={hybrid_alpha:.2f}): {query_text[:30]}...")
self.root.update_idletasks()
def query_thread():
results = self.reader.execute_query(
query_text=query_text,
n_results=n_results,
query_type="hybrid",
hybrid_alpha=hybrid_alpha
)
self.root.after(0, lambda: self.display_results(results))
threading.Thread(target=query_thread, daemon=True).start()
def execute_multi_vector_query(self, query_text, n_results):
"""Execute a multi-vector query"""
self.status_var.set(f"正在執行多向量查詢: {query_text.splitlines()[0][:30] if query_text.splitlines() else ''}...")
self.root.update_idletasks()
def query_thread():
results = self.reader.execute_query(
query_text=query_text,
n_results=n_results,
query_type="multi_vector"
)
self.root.after(0, lambda: self.display_results(results))
threading.Thread(target=query_thread, daemon=True).start()
def execute_id_lookup_query(self, doc_ids: List[str]):
"""Execute an ID lookup query"""
self.status_var.set(f"正在按 ID 獲取 {len(doc_ids)} 個文檔...")
self.root.update_idletasks()
def query_thread():
results = self.reader.get_documents_by_ids(doc_ids)
self.root.after(0, lambda: self.display_results(results))
threading.Thread(target=query_thread, daemon=True).start()
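Each `execute_*` method above follows the same pattern: run the query on a daemon thread so a long query cannot freeze the UI or keep the process alive, then marshal the result back with `root.after`. A stdlib-only sketch of that pattern, with a `queue.Queue` standing in for the Tk callback (names are illustrative, not from the codebase):

```python
import threading
import queue

def run_in_background(task, *args):
    """Run task(*args) on a daemon thread.

    The daemon flag mirrors the query threads above: background work
    must not keep the process alive after the UI exits. A queue.Queue
    stands in for tkinter's root.after() hand-off to the main thread.
    """
    results = queue.Queue()

    def worker():
        results.put(task(*args))

    threading.Thread(target=worker, daemon=True).start()
    return results

# The caller blocks only when it actually needs the result.
q = run_in_background(lambda x: x * 2, 21)
print(q.get(timeout=5))  # prints 42
```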
def display_results(self, results):
"""顯示查詢結果"""
@@ -679,27 +1229,49 @@ class ChromaDBReaderUI:
widget.destroy()
# Create the table
columns = ("rank", "similarity", "query_type", "id", "document")
tree = ttk.Treeview(self.list_view, columns=columns, show="headings")
tree.heading("rank", text="#")
tree.heading("similarity", text="相似度")
tree.heading("query_type", text="查詢類型")
tree.heading("id", text="文檔ID")
tree.heading("document", text="文檔內容")
tree.column("rank", width=50, anchor=CENTER)
tree.column("similarity", width=100, anchor=CENTER)
tree.column("query_type", width=120, anchor=CENTER)  # widened to fit longer type names
tree.column("id", width=150)
tree.column("document", width=530)  # adjusted width
# Display-name mapping for query types
query_type_names = {
"basic": "基本查詢",
"metadata": "元數據查詢",
"hybrid": "混合查詢",
"multi_vector": "多向量查詢",
"id_lookup": "ID 查詢"  # added
}
# Add results to the table
for result in results:
raw_query_type = result.get("query_type", "basic")
display_query_type = query_type_names.get(raw_query_type, raw_query_type.capitalize())
if raw_query_type == "hybrid" and "hybrid_alpha" in result:
display_query_type += f" (α={result['hybrid_alpha']:.2f})"
if raw_query_type == "multi_vector" and "query_index" in result:
display_query_type += f" (Q{result['query_index']+1})"
similarity_display = f"{result.get('similarity', 0.0):.4f}" if result.get('similarity') is not None else "N/A"
tree.insert(
"", "end",
values=(
result.get("rank", "-"),
similarity_display,
display_query_type,
result.get("id", "N/A"),
result.get("document", "")[:100] + ("..." if len(result.get("document", "")) > 100 else "")
)
)
@@ -710,7 +1282,6 @@ class ChromaDBReaderUI:
# Double-click an item to show the full document
tree.bind("<Double-1>", lambda event: self.show_full_document(tree))
# Layout
tree.pack(side=LEFT, fill=BOTH, expand=YES)
scrollbar.pack(side=RIGHT, fill=Y)
@@ -739,7 +1310,10 @@ class ChromaDBReaderUI:
# Add document info
info_text = f"文檔ID: {result['id']}\n"
if result.get('similarity') is not None:
info_text += f"相似度: {result['similarity']:.4f}\n"
else:
info_text += "相似度: N/A\n"
if result['metadata']:
info_text += "\n元數據:\n"
@@ -806,9 +1380,10 @@ class ChromaDBReaderUI:
title_frame = ttk.Frame(card)
title_frame.pack(fill=X)
similarity_text_detail = f"{result['similarity']:.4f}" if result.get('similarity') is not None else "N/A"
ttk.Label(
title_frame,
text=f"#{result['rank']} - 相似度: {similarity_text_detail}",
font=("TkDefaultFont", 10, "bold")
).pack(side=LEFT)
@@ -881,7 +1456,10 @@ class ChromaDBReaderUI:
# Add document info
info_text = f"文檔ID: {result['id']}\n"
if result.get('similarity') is not None:
info_text += f"相似度: {result['similarity']:.4f}\n"
else:
info_text += "相似度: N/A\n"
if result['metadata']:
info_text += "\n元數據:\n"
@@ -1250,4 +1828,4 @@ def main():
root.mainloop()
if __name__ == "__main__":
main()

tools/color_picker.py Normal file

@@ -0,0 +1,147 @@
import cv2
import numpy as np
import pyautogui
def pick_color_fixed():
# Capture the game area
screenshot = pyautogui.screenshot(region=(150, 330, 600, 880))
img = np.array(screenshot)
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
# Convert to HSV
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Create the picker window
cv2.namedWindow('Color Picker')
# Store sample points
sample_points = []
# Mouse callback
def mouse_callback(event, x, y, flags, param):
if event == cv2.EVENT_LBUTTONDOWN:
# Read the HSV value at the clicked position
hsv_value = hsv_img[y, x]
sample_points.append(hsv_value)
print(f"添加采样点 #{len(sample_points)}: HSV = {hsv_value}")
# Mark the sample point on the image
cv2.circle(img, (x, y), 3, (0, 255, 0), -1)
cv2.imshow('Color Picker', img)
# Recompute the range once there is at least one sample
if len(sample_points) >= 1:
calculate_range()
def calculate_range():
"""Safely compute the HSV range, avoiding integer overflow"""
if not sample_points:
return
# Convert to a numpy array
points_array = np.array(sample_points)
# Extract each channel and compute its range safely
h_values = points_array[:, 0].astype(np.int32)  # int32 avoids uint8 overflow
s_values = points_array[:, 1].astype(np.int32)
v_values = points_array[:, 2].astype(np.int32)
# Check whether the H values straddle the red boundary
h_range = np.max(h_values) - np.min(h_values)
h_crosses_boundary = h_range > 90 and len(h_values) > 2
# Compute safe range values
if h_crosses_boundary:
print("检测到H值可能跨越红色边界(0/180)!")
# Special handling for boundary-crossing H values
# Method 1: the simple approach of using the full range
h_min = 0
h_max = 179
print(f"使用全H范围: [{h_min}, {h_max}]")
else:
# Normal H range computation
h_min = max(0, np.min(h_values) - 5)
h_max = min(179, np.max(h_values) + 5)
# Safe S and V ranges
s_min = max(0, np.min(s_values) - 15)
s_max = min(255, np.max(s_values) + 15)
v_min = max(0, np.min(v_values) - 15)
v_max = min(255, np.max(v_values) + 15)
print("\n推荐的HSV范围:")
print(f"\"hsv_lower\": [{h_min}, {s_min}, {v_min}],")
print(f"\"hsv_upper\": [{h_max}, {s_max}, {v_max}],")
# Show a mask preview
show_mask_preview(h_min, h_max, s_min, s_max, v_min, v_max)
def show_mask_preview(h_min, h_max, s_min, s_max, v_min, v_max):
"""Show a mask preview with detected regions marked"""
# Build the mask
if h_min <= h_max:
# Standard range
mask = cv2.inRange(hsv_img,
np.array([h_min, s_min, v_min]),
np.array([h_max, s_max, v_max]))
else:
# Handle H values that wrap around the boundary
mask1 = cv2.inRange(hsv_img,
np.array([h_min, s_min, v_min]),
np.array([179, s_max, v_max]))
mask2 = cv2.inRange(hsv_img,
np.array([0, s_min, v_min]),
np.array([h_max, s_max, v_max]))
mask = cv2.bitwise_or(mask1, mask2)
# Morphological close to join nearby regions
kernel = np.ones((5, 5), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
# Find connected components
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
# Build the result image
result_img = img.copy()
detected_count = 0
# Process each connected component
for i in range(1, num_labels):  # skip the background (label 0)
area = stats[i, cv2.CC_STAT_AREA]
# Filter by area
if 3000 <= area <= 100000:
detected_count += 1
x = stats[i, cv2.CC_STAT_LEFT]
y = stats[i, cv2.CC_STAT_TOP]
w = stats[i, cv2.CC_STAT_WIDTH]
h = stats[i, cv2.CC_STAT_HEIGHT]
# Draw the region's bounding box
cv2.rectangle(result_img, (x, y), (x+w, y+h), (0, 255, 0), 2)
# Show the region ID
cv2.putText(result_img, f"#{i}", (x+5, y+20),
cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
# Show the result
cv2.imshow('Mask Preview', result_img)
print(f"检测到 {detected_count} 个合适大小的区域")
# Set the mouse callback
cv2.setMouseCallback('Color Picker', mouse_callback)
# Print usage instructions
print("使用说明:")
print("1. 点击气泡上的多个位置进行采样")
print("2. 程序会自动计算合适的HSV范围")
print("3. 绿色方框表示检测到的区域")
print("4. 按ESC键退出")
print("\n【特别提示】如果气泡混合了红色和紫色,可能需要创建两个配置以处理H通道的边界问题")
# Show the image
cv2.imshow('Color Picker', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
if __name__ == "__main__":
pick_color_fixed()
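The range calculation in `calculate_range` casts samples to a wider integer type before adding margins, so uint8 arithmetic cannot wrap around, and it falls back to the full hue range when the samples straddle OpenCV's red 0/180 boundary. A pure-Python sketch of that logic (the function name `safe_hsv_range` is ours):

```python
def safe_hsv_range(points, h_margin=5, sv_margin=15):
    """Compute padded HSV bounds from sampled (h, s, v) triples.

    Values are cast to int before arithmetic so uint8-style samples
    cannot overflow; hue is clamped to OpenCV's [0, 179] range, and
    samples straddling the red 0/180 boundary fall back to full hue.
    """
    hs = [int(p[0]) for p in points]
    ss = [int(p[1]) for p in points]
    vs = [int(p[2]) for p in points]
    if max(hs) - min(hs) > 90 and len(hs) > 2:
        # Samples likely straddle the red boundary: use the full hue range.
        h_min, h_max = 0, 179
    else:
        h_min, h_max = max(0, min(hs) - h_margin), min(179, max(hs) + h_margin)
    s_min, s_max = max(0, min(ss) - sv_margin), min(255, max(ss) + sv_margin)
    v_min, v_max = max(0, min(vs) - sv_margin), min(255, max(vs) + sv_margin)
    return (h_min, s_min, v_min), (h_max, s_max, v_max)

lower, upper = safe_hsv_range([(100, 200, 50), (110, 210, 60)])
print(lower, upper)  # (95, 185, 35) (115, 225, 75)
```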

@@ -4,6 +4,8 @@
import pyautogui
import cv2  # opencv-python
import numpy as np
import sys  # Added for special character handling
import io  # Added for special character handling
import pyperclip
import time
import os
@@ -16,12 +18,107 @@ import queue
from typing import List, Tuple, Optional, Dict, Any
import threading  # Import threading for Lock if needed, or just use a simple flag
import math  # Added for distance calculation in dual method
import time # Ensure time is imported for MessageDeduplication
from simple_bubble_dedup import SimpleBubbleDeduplication
import difflib # Added for text similarity
class MessageDeduplication:
def __init__(self, expiry_seconds=3600): # 1 hour expiry time
self.processed_messages = {} # {message_key: timestamp}
self.expiry_seconds = expiry_seconds
def is_duplicate(self, sender, content):
"""Check if the message is a duplicate within the expiry period using text similarity."""
if not sender or not content:
return False # Missing necessary info, treat as new message
current_time = time.time()
# Walk all previously processed messages
for key, timestamp in list(self.processed_messages.items()):
# Skip expired entries; iterating over list(...) keeps this safe even
# though the dict may shrink, and purge_expired() removes them later
if current_time - timestamp >= self.expiry_seconds:
continue  # expired items are handled by purge_expired
# Parse the previously stored sender and content
stored_sender, stored_content = key.split(":", 1)
# Same sender?
if sender.lower() == stored_sender.lower():
# Calculate text similarity
similarity = difflib.SequenceMatcher(None, content, stored_content).ratio()
if similarity >= 0.95: # Use 0.95 as threshold
print(f"Deduplicator: Detected similar message (similarity: {similarity:.2f}): {sender} - {content[:20]}...")
return True
# Not a duplicate; record it
# Note: the stored content is the raw content, not clean_content
message_key = f"{sender.lower()}:{content}"
self.processed_messages[message_key] = current_time
return False
# The create_key method is no longer needed and can be removed
# def create_key(self, sender, content):
# """Create a standardized composite key."""
# # Thoroughly standardize text - remove all whitespace and punctuation, lowercase
# clean_content = ''.join(c.lower() for c in content if c.isalnum())
# clean_sender = ''.join(c.lower() for c in sender if c.isalnum())
# # Truncate content to first 100 chars to prevent overly long keys
# if len(clean_content) > 100:
# clean_content = clean_content[:100]
# return f"{clean_sender}:{clean_content}"
def purge_expired(self):
"""Remove expired message records."""
current_time = time.time()
expired_keys = [k for k, t in self.processed_messages.items()
if current_time - t >= self.expiry_seconds]
for key in expired_keys:
del self.processed_messages[key]
if expired_keys: # Log only if something was purged
print(f"Deduplicator: Purged {len(expired_keys)} expired message records.")
return len(expired_keys)
def clear_all(self):
"""Clear all recorded messages (for F7/F8 functionality)."""
count = len(self.processed_messages)
self.processed_messages.clear()
if count > 0: # Log only if something was cleared
print(f"Deduplicator: Cleared all {count} message records.")
return count
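The 0.95 `SequenceMatcher.ratio()` threshold used by `is_duplicate` can be illustrated standalone (the helper name is ours):

```python
import difflib

def is_near_duplicate(new_text, seen_texts, threshold=0.95):
    """Return True when new_text is at least `threshold` similar to any
    previously seen text, using the same SequenceMatcher.ratio() measure
    as MessageDeduplication.is_duplicate above.
    """
    return any(
        difflib.SequenceMatcher(None, new_text, old).ratio() >= threshold
        for old in seen_texts
    )
```

A threshold of 0.95 tolerates small OCR/copy variations while still treating genuinely different messages as new, which is exactly the false-negative reduction the commit message describes.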
# --- Global Pause Flag ---
# Using a simple mutable object (list) for thread-safe-like access without explicit lock
# Or could use threading.Event()
monitoring_paused_flag = [False]  # List containing a boolean
# --- Global Error Handling Setup for Text Encoding ---
def handle_text_encoding(text, default_text="[無法處理的文字]"):
"""Safely handle arbitrary text so encoding issues never crash the program"""
if text is None:
return default_text
try:
# Return the text as-is (assumed UTF-8 safe)
return text
except UnicodeEncodeError:
try:
# Replace unencodable characters with displayable placeholders
return text.encode('utf-8', errors='replace').decode('utf-8')
except:
# Last resort: drop any characters that cannot be handled
try:
return text.encode('utf-8', errors='ignore').decode('utf-8')
except:
return default_text
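Note that `return text` inside the `try` block cannot itself raise `UnicodeEncodeError`; that error only surfaces when the text is actually encoded, e.g. while printing to a console with a narrow codec. A sketch of re-encoding at the point where the target codec is known (the function name is ours, not from the codebase):

```python
def safe_console_text(text, encoding="utf-8", default="[unprintable text]"):
    """Re-encode text for output on a device that uses `encoding`.

    Characters the codec cannot represent are replaced instead of
    raising UnicodeEncodeError; None becomes a placeholder string.
    """
    if text is None:
        return default
    return text.encode(encoding, errors="replace").decode(encoding)
```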
# --- Color Config Loading ---
def load_bubble_colors(config_path='bubble_colors.json'):
"""Loads bubble color configuration from a JSON file."""
@@ -120,6 +217,9 @@ PROFILE_OPTION_IMG = os.path.join(TEMPLATE_DIR, "profile_option.png")
COPY_NAME_BUTTON_IMG = os.path.join(TEMPLATE_DIR, "copy_name_button.png")
SEND_BUTTON_IMG = os.path.join(TEMPLATE_DIR, "send_button.png")
CHAT_INPUT_IMG = os.path.join(TEMPLATE_DIR, "chat_input.png")
# Newly added template paths
CHAT_OPTION_IMG = os.path.join(TEMPLATE_DIR, "chat_option.png")
UPDATE_CONFIRM_IMG = os.path.join(TEMPLATE_DIR, "update_confirm.png")
# State Detection
PROFILE_NAME_PAGE_IMG = os.path.join(TEMPLATE_DIR, "Profile_Name_page.png")
PROFILE_PAGE_IMG = os.path.join(TEMPLATE_DIR, "Profile_page.png")
@@ -1068,7 +1168,13 @@ class InteractionModule:
if copied and copied_text and copied_text != "___MCP_CLEAR___":
print(f"Successfully copied text, length: {len(copied_text)}")
# Added encoding-safe handling
try:
safe_text = handle_text_encoding(copied_text.strip())
return safe_text
except Exception as e:
print(f"Error handling copied text encoding: {str(e)}")
return copied_text.strip()  # try to return the raw text even on error
else:
print("Error: Copy operation unsuccessful or clipboard content invalid.")
return None
@@ -1601,13 +1707,22 @@ def perform_state_cleanup(detector: DetectionModule, interactor: InteractionModule
# --- UI Monitoring Loop Function (To be run in a separate thread) ---
def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue, deduplicator: 'MessageDeduplication'):
"""
Continuously monitors the UI, detects triggers, performs interactions,
puts trigger data into trigger_queue, and processes commands from command_queue.
"""
print("\n--- Starting UI Monitoring Loop (Thread) ---")
# --- Initialize the bubble-image deduplication system (new) ---
bubble_deduplicator = SimpleBubbleDeduplication(
storage_file="simple_bubble_dedup.json",
max_bubbles=4,  # number of recent bubble hashes to keep
threshold=7,  # hash-difference threshold (smaller is stricter)
hash_size=16  # hash size
)
# --- End bubble-image deduplication init ---
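`SimpleBubbleDeduplication` lives in a separate module not shown in this diff; per the commit message it compares perceptual hashes (pHash) of bubble snapshots by Hamming distance. A toy average-hash over a plain 2-D grayscale list illustrates the idea without PIL or imagehash (all names here are ours):

```python
def average_hash(gray, hash_size=8):
    """Toy average-hash over a 2-D grayscale list: downsample to
    hash_size x hash_size, then threshold each cell against the mean.
    Real pHash (as used by the bubble deduplicator) thresholds DCT
    coefficients instead, which is more robust to small edits.
    """
    h, w = len(gray), len(gray[0])
    # Nearest-neighbour downsample.
    small = [
        [gray[r * h // hash_size][c * w // hash_size] for c in range(hash_size)]
        for r in range(hash_size)
    ]
    flat = [v for row in small for v in row]
    avg = sum(flat) / len(flat)
    return [1 if v > avg else 0 for v in flat]

def hamming(a, b):
    """Number of differing hash bits; below a small threshold = duplicate."""
    return sum(x != y for x, y in zip(a, b))
```

Two captures of the same bubble hash to (nearly) identical bit strings, so their Hamming distance falls under the threshold even when the pixels differ slightly.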
# --- Initialization (Instantiate modules within the thread) ---
# --- Template Dictionary Setup (Refactored) ---
essential_templates = {
@@ -1639,7 +1754,9 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
'page_sec': PAGE_SEC_IMG, 'page_str': PAGE_STR_IMG,
'dismiss_button': DISMISS_BUTTON_IMG, 'confirm_button': CONFIRM_BUTTON_IMG,
'close_button': CLOSE_BUTTON_IMG, 'back_arrow': BACK_ARROW_IMG,
'reply_button': REPLY_BUTTON_IMG,
# Newly added templates
'chat_option': CHAT_OPTION_IMG, 'update_confirm': UPDATE_CONFIRM_IMG,
}
legacy_templates = {
# Deprecated Keywords (for legacy method fallback)
@@ -1745,13 +1862,27 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
elif action == 'clear_history':  # Added for F7
print("UI Thread: Processing clear_history command.")
recent_texts.clear()
deduplicator.clear_all()  # Simultaneously clear deduplication records
# --- New: clear bubble deduplication records ---
if 'bubble_deduplicator' in locals():
bubble_deduplicator.clear_all()
# --- End bubble deduplication cleanup ---
print("UI Thread: recent_texts and deduplicator records cleared.")
elif action == 'reset_state':  # Added for F8 resume
print("UI Thread: Processing reset_state command.")
recent_texts.clear()
last_processed_bubble_info = None
deduplicator.clear_all()  # Simultaneously clear deduplication records
# --- New: clear bubble deduplication records ---
if 'bubble_deduplicator' in locals():
bubble_deduplicator.clear_all()
# --- End bubble deduplication cleanup ---
print("UI Thread: recent_texts, last_processed_bubble_info, and deduplicator records reset.")
else:
print(f"UI Thread: Received unknown command: {action}")
@@ -1776,6 +1907,19 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
# --- If not paused, proceed with UI Monitoring ---
# print("[DEBUG] UI Loop: Monitoring is active. Proceeding...") # DEBUG REMOVED
# --- Added: check for the chat_option state ---
try:
chat_option_locs = detector._find_template('chat_option', confidence=0.8)
if chat_option_locs:
print("UI Thread: Detected chat_option overlay. Pressing ESC to dismiss...")
interactor.press_key('esc')
time.sleep(0.2)  # give the UI a moment to respond
print("UI Thread: Pressed ESC to dismiss chat_option. Continuing...")
continue  # restart the loop to make sure the overlay is gone
except Exception as chat_opt_err:
print(f"UI Thread: Error checking for chat_option: {chat_opt_err}")
# keep going; don't interrupt the main loop
# --- Check for Main Screen Navigation ---
# print("[DEBUG] UI Loop: Checking for main screen navigation...") # DEBUG REMOVED
try:
@@ -1814,8 +1958,19 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
# Use a slightly lower confidence maybe, or state_confidence
chat_room_locs = detector._find_template('chat_room', confidence=detector.state_confidence)
if not chat_room_locs:
print("UI Thread: Not in chat room state before bubble detection. Checking for update confirm...")
# Check whether the update-confirm button is present
update_confirm_locs = detector._find_template('update_confirm', confidence=0.8)
if update_confirm_locs:
print("UI Thread: Detected update_confirm button. Clicking to proceed...")
interactor.click_at(update_confirm_locs[0][0], update_confirm_locs[0][1])
time.sleep(0.5)  # give the update process some time
print("UI Thread: Clicked update_confirm button. Continuing...")
continue  # restart the loop to re-check state
# No update-confirm button found; fall back to the original cleanup logic
print("UI Thread: No update_confirm button found. Attempting cleanup...")
perform_state_cleanup(detector, interactor)
# Regardless of cleanup success, restart the loop to re-evaluate state from the top
print("UI Thread: Continuing loop after attempting chat room cleanup.")
@@ -1916,6 +2071,13 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
print("Warning: Failed to capture bubble snapshot. Skipping this bubble.")
continue  # Skip to next bubble
# --- New: Image deduplication check ---
if bubble_deduplicator.is_duplicate(bubble_snapshot, bubble_region_tuple):
print("Detected duplicate bubble, skipping processing")
perform_state_cleanup(detector, interactor)
continue  # Skip processing this bubble
# --- End of image deduplication check ---
# --- Save Snapshot for Debugging ---
try:
screenshot_index = (screenshot_counter % MAX_DEBUG_SCREENSHOTS) + 1
@@ -1927,7 +2089,7 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
screenshot_counter += 1
except Exception as save_err:
print(f"Error saving bubble snapshot to {screenshot_path}: {repr(save_err)}")
except Exception as snapshot_err:
print(f"Error taking initial bubble snapshot: {repr(snapshot_err)}")
continue  # Skip to next bubble
@@ -1982,16 +2144,6 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
perform_state_cleanup(detector, interactor)  # Attempt cleanup
continue  # Skip to next bubble
# Check recent text history
# print("[DEBUG] UI Loop: Checking recent text history...") # DEBUG REMOVED
if bubble_text in recent_texts:
print(f"Content '{bubble_text[:30]}...' in recent history, skipping this bubble.")
continue # Skip to next bubble
print(">>> New trigger event <<<")
# Add to recent texts *before* potentially long interaction
recent_texts.append(bubble_text)
# 5. Interact: Get Sender Name (uses re-location internally via retrieve_sender_name_interaction)
# print("[DEBUG] UI Loop: Retrieving sender name...") # DEBUG REMOVED
sender_name = None
@@ -2069,6 +2221,32 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
print("Error: Could not get sender name for this bubble, skipping.")
continue  # Skip to next bubble
+# --- Deduplication Check ---
+# This is the new central point for deduplication and recent_texts logic
+if sender_name and bubble_text: # Ensure both are valid before deduplication
+    if deduplicator.is_duplicate(sender_name, bubble_text):
+        print(f"UI Thread: Skipping duplicate message via Deduplicator: {sender_name} - {bubble_text[:30]}...")
+        # Clean up UI state, as interaction might have occurred during sender_name retrieval
+        perform_state_cleanup(detector, interactor)
+        continue # Skip this bubble
+    # Former safeguard: the recent_texts check is superseded by the deduplicator
+    # if bubble_text in recent_texts:
+    #     print(f"UI Thread: Content '{bubble_text[:30]}...' in recent_texts history, skipping.")
+    #     perform_state_cleanup(detector, interactor) # Cleanup as we are skipping
+    #     continue
+    # Not a duplicate by any means, so proceed
+    print(">>> New trigger event (passed deduplication) <<<")
+    # recent_texts.append(bubble_text) # No longer needed with image deduplication
+else:
+    # This case implies sender_name or bubble_text was None/empty, which should
+    # have been caught by earlier checks. If somehow reached, log and skip.
+    print(f"Warning: sender_name ('{sender_name}') or bubble_text ('{bubble_text[:30]}...') is invalid before deduplication check. Skipping.")
+    perform_state_cleanup(detector, interactor)
+    continue
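The `deduplicator.is_duplicate(sender, text)` call above is the only view this hunk gives of the text-dedup logic; per the commit history it was later moved to difflib-based similarity matching to reduce false negatives. A minimal sketch of such a matcher — the class name, thresholds, and time window are illustrative, not the project's actual implementation:

```python
import difflib
import time

class MessageDeduplicator:
    """Illustrative similarity-based deduplicator (assumed interface:
    is_duplicate(sender, text)); thresholds are placeholders."""

    def __init__(self, similarity_threshold=0.9, window_seconds=60.0, max_entries=20):
        self.similarity_threshold = similarity_threshold
        self.window_seconds = window_seconds
        self.max_entries = max_entries
        self.recent = []  # list of (normalized_sender, normalized_text, timestamp)

    @staticmethod
    def _normalize(s):
        # Lowercase and collapse whitespace for robust comparison
        return " ".join((s or "").lower().split())

    def is_duplicate(self, sender, text):
        now = time.time()
        sender_n, text_n = self._normalize(sender), self._normalize(text)
        # Drop entries that have aged out of the time window
        self.recent = [e for e in self.recent if now - e[2] <= self.window_seconds]
        for old_sender, old_text, _ in self.recent:
            if old_sender != sender_n:
                continue
            # Fuzzy match instead of exact equality: small text variations
            # (punctuation, OCR noise) still count as duplicates
            ratio = difflib.SequenceMatcher(None, old_text, text_n).ratio()
            if ratio >= self.similarity_threshold:
                return True
        self.recent.append((sender_n, text_n, now))
        del self.recent[:-self.max_entries]  # keep only the newest entries
        return False
```

Using `SequenceMatcher.ratio()` rather than exact string membership (the old `recent_texts` check) is what reduces false negatives: a near-identical re-read of the same bubble still scores above the threshold.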
# --- Attempt to activate reply context ---
# print("[DEBUG] UI Loop: Attempting to activate reply context...") # DEBUG REMOVED
reply_context_activated = False
@@ -2115,34 +2293,71 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
# 7. Send Trigger Info to Main Thread
print("\n>>> Putting trigger info in Queue <<<")
-print(f" Sender: {sender_name}")
-print(f" Content: {bubble_text[:100]}...")
+try:
+    # Safely process and display the sender name
+    safe_sender_display = handle_text_encoding(sender_name, "[unknown sender]")
+    print(f" Sender: {safe_sender_display}")
+    # Safely process and display the message content
+    if bubble_text:
+        display_text = bubble_text[:100] + "..." if len(bubble_text) > 100 else bubble_text
+        safe_content_display = handle_text_encoding(display_text, "[unprocessable text content]")
+        print(f" Content: {safe_content_display}")
+    else:
+        print(" Content: [empty]")
+except Exception as e_display:
+    print(f"Error displaying message info: {str(e_display)}")
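`handle_text_encoding` is called throughout this hunk but defined elsewhere in the repository. A plausible minimal implementation — signature and behavior inferred from the call sites (text in, fallback string out on failure), so this is an assumption, not the project's actual code:

```python
import sys

def handle_text_encoding(text, fallback="[unprocessable text]"):
    """Best-effort conversion of text into something safely printable.

    Returns `fallback` when the text is empty or cannot be processed.
    """
    if not text:
        return fallback
    try:
        if not isinstance(text, str):
            text = str(text)
        # Round-trip through the console encoding, replacing any characters
        # the terminal cannot represent (e.g. CJK text on a cp1252 console)
        target = sys.stdout.encoding or "utf-8"
        return text.encode(target, errors="replace").decode(target, errors="replace")
    except Exception:
        return fallback
```

The point of the helper is that a single undisplayable character no longer aborts the whole trigger: the print either succeeds with replacement characters or degrades to the fallback marker.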
print(f" Bubble Region: {bubble_region}") # Original region for context
print(f" Reply Context Activated: {reply_context_activated}")
try:
+    # Ensure all text data is safely processed before enqueueing
    data_to_send = {
-        'sender': sender_name,
-        'text': bubble_text,
+        'sender': handle_text_encoding(sender_name, "[unknown sender]"),
+        'text': handle_text_encoding(bubble_text, "[unprocessable text content]"),
-        'bubble_region': bubble_region, # Send original region for context if needed
+        'bubble_region': bubble_region,
        'reply_context_activated': reply_context_activated,
-        'bubble_snapshot': bubble_snapshot, # Send the snapshot used
+        'bubble_snapshot': bubble_snapshot,
        'search_area': search_area
    }
    trigger_queue.put(data_to_send)
    print("Trigger info (with region, reply flag, snapshot, search_area) placed in Queue.")
+# --- New: update the sender info in the bubble deduplication record ---
+# Note: the bubble was already added to the dedup system earlier, before the
+# sender name had been retrieved. Here we try to update the sender info
+# after the fact (if the implementation allows it).
+if 'bubble_deduplicator' in locals() and bubble_snapshot and sender_name:
+    bubble_id = bubble_deduplicator.generate_bubble_id(bubble_region_tuple)
+    if bubble_id in bubble_deduplicator.recent_bubbles:
+        bubble_deduplicator.recent_bubbles[bubble_id]['sender'] = sender_name
+        bubble_deduplicator._save_storage()
+# --- End of sender info update ---
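The `bubble_deduplicator` updated here is the perceptual-hash module (`simple_bubble_dedup.py`) described in the commit log: keep the last N hashes (default 5) and skip any bubble whose hash sits within a Hamming-distance threshold (default 5) of a recent one. A dependency-free sketch of that idea, substituting a simple average hash for the module's pHash — all names and signatures below are illustrative:

```python
def average_hash(gray, hash_size=8):
    """64-bit average hash of a 2D grayscale image (nested lists of 0-255).

    A simplified stand-in for the perceptual hash (pHash) the real module uses.
    """
    h, w = len(gray), len(gray[0])
    # Downsample to hash_size x hash_size by block averaging
    small = []
    for by in range(hash_size):
        for bx in range(hash_size):
            ys = range(by * h // hash_size, max(by * h // hash_size + 1, (by + 1) * h // hash_size))
            xs = range(bx * w // hash_size, max(bx * w // hash_size + 1, (bx + 1) * w // hash_size))
            block = [gray[y][x] for y in ys for x in xs]
            small.append(sum(block) / len(block))
    mean = sum(small) / len(small)
    bits = 0
    for v in small:
        # One bit per cell: brighter than the mean or not
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def is_duplicate_bubble(new_hash, recent_hashes, threshold=5, keep=5):
    """Skip a bubble whose hash is within `threshold` bits of a recent one;
    otherwise record it, keeping only the last `keep` hashes."""
    for h in recent_hashes:
        if hamming(new_hash, h) <= threshold:
            return True
    recent_hashes.append(new_hash)
    del recent_hashes[:-keep]
    return False
```

Visually near-identical bubbles (re-rendered, slightly shifted, or with minor OCR-level text differences) hash to nearby values, so the Hamming check catches repeats that exact text comparison would miss.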
# --- CRITICAL: Break loop after successfully processing one trigger ---
print("--- Single bubble processing complete. Breaking scan cycle. ---")
break # Exit the 'for target_bubble_info in sorted_bubbles' loop
except Exception as q_err:
-    print(f"Error putting data in Queue: {q_err}")
-    # Don't break if queue put fails, maybe try next bubble? Or log and break?
+    print(f"Error preparing or enqueueing data: {q_err}")
+    # Fall back to a minimal data set to preserve functionality
+    try:
+        minimal_data = {
+            'sender': "[data processing error]",
+            'text': handle_text_encoding(bubble_text[:100] if bubble_text else "[failed to retrieve content]"), # Apply encoding here too
+            'bubble_region': bubble_region,
+            'reply_context_activated': False, # Sensible default
+            'bubble_snapshot': bubble_snapshot, # Keep snapshot if available
+            'search_area': search_area
+        }
+        trigger_queue.put(minimal_data)
+        print("Minimal fallback data placed in Queue after error.")
+    except Exception as min_q_err:
+        print(f"Critical failure: Could not place any data in queue: {min_q_err}")
    # Let's break here too, as something is wrong.
    print("Breaking scan cycle due to queue error.")
    break
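The error path above follows a graceful-degradation pattern: try the full payload, and on any failure fall back to a minimal one so the consumer thread still receives a trigger. That pattern can be factored into a small helper — an illustrative refactoring, not code from the repository:

```python
import queue

def enqueue_with_fallback(q, build_full, build_minimal):
    """Enqueue the full payload if possible, else a minimal fallback payload.

    `build_full` and `build_minimal` are zero-argument callables so that a
    failure while *constructing* the payload is also caught, not just a
    failure in `q.put` itself. Returns 'full', 'minimal', or 'failed'.
    """
    try:
        q.put(build_full())
        return "full"
    except Exception:
        try:
            q.put(build_minimal())
            return "minimal"
        except Exception:
            # Nothing could be enqueued; caller should log and abort the cycle
            return "failed"
```

Deferring payload construction into callables mirrors the hunk's behavior, where the exception handler covers both building `data_to_send` and the `put` call.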
-# End of keyword found block (if keyword_coords:)
+# End of keyword found block (if result:)
# End of loop through sorted bubbles (for target_bubble_info...)
# If the loop finished without breaking (i.e., no trigger processed), wait the full interval.