Compare commits

34 Commits
Refactorin... → main

Commit SHAs (author and date columns were not captured):

2c8a9e4588, f9457bf992, a8603d4d45, e3e3d3b914, dad375dec8, 2ac63718a9, 0b794a4c32, 677a73f026, 890772f70e, 2836ce899d, 51a99ee5ad, a5b6a44164, 59471b62ce, b33ea85768, 4a03ca4424, 7d9ead1c60, bccc6d413f, 65df12a20e, 2a68f04e87, 4dd5d91029, 48c0c25a42, ce111cf3d5, a29d336df0, 6cffa4c70c, 90b3a492d7, 42a6bde23f, 7e4383fa98, 74270aace7, 30e418eba4, 5cba0b970c, 583600760b, a9ff1959ef, 381b40c62f, 30df8f8320
.gitignore (vendored, 3 changes)

@@ -3,8 +3,11 @@
llm_debug.log
config.py
config.py.bak
simple_bubble_dedup.json
__pycache__/
debug_screenshots/
chat_logs/
backup/
chroma_data/
wolf_control.py
remote_config.json

ClaudeCode.md (213 changes)
@@ -15,72 +15,66 @@ Wolf Chat is a chatbot based on the MCP (Modular Capability Provider) framework

### Core Components

1. **Main controller module (main.py)**
   - Coordinates the work of all modules
   - Initializes MCP connections
   - **Fault tolerance**: even if no MCP server is configured in `config.py`, or every server connection fails, the program now keeps running and only prints a warning; MCP features are simply unavailable. (Added 2025-04-21)
   - **Server subprocess management (fixed 2025-05-02)**: uses `mcp.client.stdio.stdio_client` to launch and connect to each MCP server defined in `config.py`. `stdio_client` acts as an async context manager responsible for the lifecycle of the subprocess it spawns.
   - **Windows-specific handling (fixed 2025-05-02)**: on Windows, if `pywin32` is available, a console event handler is registered (`win32api.SetConsoleCtrlHandler`). The handler mainly helps trigger the normal shutdown path (which ultimately calls `AsyncExitStack.aclose()`) rather than terminating the process directly. Actual termination of server subprocesses relies on the cleanup the `stdio_client` context manager performs during `AsyncExitStack.aclose()`.
   - **Memory system initialization (added 2025-05-02)**: calls `chroma_client.initialize_memory_system()` at startup; the `ENABLE_PRELOAD_PROFILES` setting in `config.py` decides whether memory preloading is enabled.
   - Sets up and manages the main event loop
   - **Memory preloading (added 2025-05-02)**: in the main event loop, if preloading is enabled, after each UI trigger and before calling the LLM, the program tries to prefetch the user profile (`get_entity_profile`), related memories (`get_related_memories`), and potentially relevant bot knowledge (`get_bot_knowledge`) from ChromaDB.
   - Handles program lifecycle management and resource cleanup (MCP server subprocess termination is managed indirectly through `AsyncExitStack`)
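The `AsyncExitStack` lifecycle pattern described above can be sketched with a stand-in context manager. `fake_stdio_client` below is a hypothetical placeholder for `mcp.client.stdio.stdio_client`, not the real API; only the enter/cleanup pattern and the fault-tolerance behavior are the point:

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

# Hypothetical stand-in for mcp.client.stdio.stdio_client: an async context
# manager whose exit path cleans up the subprocess it "launched".
@asynccontextmanager
async def fake_stdio_client(server_name, log):
    log.append(f"start {server_name}")   # the real client spawns a subprocess here
    try:
        yield f"session-{server_name}"
    finally:
        log.append(f"stop {server_name}")  # cleanup runs during AsyncExitStack.aclose()

async def main():
    log = []
    stack = AsyncExitStack()
    # Enter each server's context on one shared stack; a failure for one server
    # is tolerated, mirroring the fault-tolerance note above.
    for name in ["memory", "search"]:
        try:
            session = await stack.enter_async_context(fake_stdio_client(name, log))
            log.append(f"connected {session}")
        except Exception as e:
            print(f"Warning: could not connect to {name}: {e}")
    # ... the main event loop would run here ...
    await stack.aclose()  # closes all server contexts in reverse order
    return log

log = asyncio.run(main())
print(log)
```

Because `aclose()` unwinds the stack in reverse, the last server started is the first one shut down, which matches how `main.py` relies on the stack rather than terminating subprocesses directly.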
2. **LLM interaction module (llm_interaction.py)**
   - Communicates with the language model API
   - Manages the system prompt and persona settings
   - **Conditional prompting (added 2025-05-02)**: `get_system_prompt` now accepts preloaded user profiles, related memories, and bot knowledge, and dynamically adjusts the memory-retrieval protocol section of the system prompt depending on whether preloaded data is available.
   - Handles the language model's tool-calling features
   - Formats LLM responses
   - Provides a tool-result synthesis mechanism
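The conditional-prompting idea could look like this minimal sketch (hypothetical: the real `get_system_prompt` in `llm_interaction.py` takes richer arguments, and the wording here is illustrative):

```python
# Hypothetical sketch: adjust the memory-retrieval instructions in the system
# prompt depending on whether preloaded ChromaDB data was supplied.
def get_system_prompt(persona, profile=None, memories=None, knowledge=None):
    parts = [f"You are {persona}."]
    preloaded = [x for x in (profile, memories, knowledge) if x]
    if preloaded:
        parts.append("Preloaded context is provided below; do not query memory tools again:")
        parts.extend(f"- {item}" for item in preloaded)
    else:
        parts.append("No preloaded context; follow the memory retrieval protocol via MCP tools.")
    return "\n".join(parts)

print(get_system_prompt("Wolf", profile="User likes strategy games"))
```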
3. **UI interaction module (ui_interaction.py)**
   - Monitors the game chat window using image recognition
   - Detects chat bubbles and keywords
   - Copies chat content and retrieves sender names
   - Types generated responses back into the game
4. **MCP client module (mcp_client.py)**
   - Manages communication with MCP servers
   - Lists and invokes available tools
   - Handles tool-call results and errors
5. **Configuration module (config.py)**
   - Centralizes system parameters and settings
   - Integrates environment variables
   - Configures API keys and server settings
6. **Persona definition (persona.json)**
   - Defines the bot's personality in detail
   - Contains appearance, speaking style, personality traits, and other information
   - Supplied to the LLM to keep role-playing consistent
7. **Game window monitor module (game_monitor.py)** (replaces window-setup-script.py and the old window-monitor-script.py)
   - Continuously monitors the game window (`config.WINDOW_TITLE`).
   - Keeps the window at the position (`GAME_WINDOW_X`, `GAME_WINDOW_Y`) and size (`GAME_WINDOW_WIDTH`, `GAME_WINDOW_HEIGHT`) specified in the config file (`config.py`).
   - Keeps the window always on top.
   - **Scheduled game restarts** (if `config.ENABLE_SCHEDULED_RESTART` is True):
     - Runs at the interval set by `config.RESTART_INTERVAL_MINUTES`.
     - **Simplified flow (2025-04-25)**:
       1. Sends a JSON signal (`{'action': 'pause_ui'}`) to `main.py` via `stdout`, requesting that UI monitoring be paused.
       2. Waits a fixed 30 seconds.
       3. Calls `restart_game_process`, which **attempts** to terminate (`terminate`/`kill`) the `LastWar.exe` process (**unverified**).
       4. Waits a fixed 2 seconds.
       5. **Attempts** to launch `config.GAME_EXECUTABLE_PATH` with `os.startfile` (**unverified**).
       6. Waits a fixed 30 seconds.
       7. Uses a `try...finally` block to guarantee the next step **always** runs.
       8. Sends a JSON signal (`{'action': 'resume_ui'}`) to `main.py` via `stdout`, requesting that UI monitoring resume.
   - **Window adjustment**: the window's position/size/topmost state is handled entirely by the main loop of `monitor_game_window`; the restart flow no longer performs immediate adjustments.
   - **Runs as a separate process**: launched by `main.py` with `subprocess.Popen`, which captures its `stdout` (for JSON signals) and `stderr` (for logs).
   - **Inter-process communication**:
     - `game_monitor.py` -> `main.py`: sends JSON-formatted `pause_ui` and `resume_ui` signals via `stdout`.
   - **Log handling**: `game_monitor.py` logs are configured to go to `stderr` so that `stdout` stays clean and signal delivery stays reliable. `main.py` reads `stderr` and may display these logs.
   - **Lifecycle management**: created by `main.py` at startup and terminated (`terminate`) during `shutdown`.
7. **Game manager module (game_manager.py)** (replaces the old `game_monitor.py`)
   - **Core class `GameMonitor`**: encapsulates all game window monitoring, auto-restart, and process management features.
   - **Managed by `Setup.py`**:
     - Instantiated and started in `Setup.py`'s "Start Managed Bot & Game" flow.
     - Stopped by `Setup.py` when the session ends.
     - Settings (window title, paths, restart interval, etc.) are passed in by `Setup.py` and can be updated at runtime via the `update_config` method.
   - **Features**:
     - Continuously monitors the game window (`config.WINDOW_TITLE`).
     - Keeps the window at the position and size specified in the config file.
     - Keeps the window active (brought to the foreground with focus).
     - **Scheduled game restarts**: runs at the interval from the config file.
     - **Callback mechanism**: after a restart completes, a callback notifies `Setup.py` (e.g. `restart_complete`), which then handles restarting the bot.
     - **Process management**: uses `psutil` (when available) to find and terminate the game process.
     - **Cross-platform launch**: starts the game with `os.startfile` (Windows) or `subprocess.Popen` (other platforms).
   - **Standalone mode**: `game_manager.py` can still run as a standalone script (like the old `game_monitor.py`); it then loads settings from `config.py` and sends JSON signals via `stdout`.
8. **ChromaDB client module (chroma_client.py)** (added 2025-05-02)
   - Handles connecting to and interacting with the local ChromaDB vector database.
   - Provides functions to initialize the client, get/create collections, and query user profiles, related memories, and bot knowledge.
   - Uses `chromadb.PersistentClient` to connect to the persistent database.
### Data Flow

@@ -130,7 +124,14 @@ Wolf Chat is a chatbot based on the MCP (Modular Capability Provider) framework

* **Avatar coordinate calculation**: from the top-left corner of the **newly** located bubble, applies specific offsets (`AVATAR_OFFSET_X_REPLY`, `AVATAR_OFFSET_Y_REPLY`) to compute the avatar click position.
* **Interaction (with retries)**: clicks the computed avatar position and checks whether the profile page opened (`Profile_page.png`). On failure it retries up to 3 times (re-locating the bubble before each retry). On success it continues navigating the menu to copy the username.
* **Original offset**: the original `-55` pixel horizontal offset (`AVATAR_OFFSET_X`) is kept for other features such as `remove_user_position`.

5. **Duplicate prevention** (old): used a history of recently processed text (`recent_texts`) to avoid re-triggering on the same message.

5. **Duplicate prevention** (new):
   * **Image-hash deduplication**: a new `simple_bubble_dedup.py` module implements a deduplication system based on perceptual image hashing.
   * **Principle**: the system computes perceptual hashes of recently processed bubble images and keeps the hashes of the last N bubbles (default 5). When a new bubble is detected, its hash is computed and compared against the stored hashes; if the difference is below the configured threshold (default 5), the bubble is treated as a duplicate and skipped.
   * **Implementation**: `run_ui_monitoring_loop` in `ui_interaction.py` initializes a `SimpleBubbleDeduplication` instance and calls its `is_duplicate` method after a keyword is detected and the bubble snapshot is captured.
   * **State management**: the most recent bubble hashes are persisted in the `simple_bubble_dedup.json` file.
   * **Cleanup**: the F7 (`clear_history`) and F8 (`reset_state`) functions have been extended to also clear the image deduplication records.
   * **Sender info updates**: after a bubble is processed successfully and queued, the system tries to update the sender name on the corresponding deduplication record.
   * **Text history (deprecated)**: the original `recent_texts` text-based duplicate check has been **removed or commented out**; image-hash deduplication is now the primary mechanism.
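The hash/compare/threshold loop above can be sketched in pure Python with a difference hash. `BubbleDedup`, `dhash_bits`, and the grid inputs are illustrative stand-ins for `SimpleBubbleDeduplication`, which hashes real screenshots:

```python
# Minimal sketch of perceptual-hash deduplication, assuming bubbles arrive as
# small 2D grayscale grids (lists of rows of 0-255 values).

def dhash_bits(gray, size=8):
    """Difference hash: compare each sampled pixel to its right neighbour."""
    h, w = len(gray), len(gray[0])
    bits = []
    for y in range(size):
        for x in range(size):
            # Nearest-neighbour sampling of a (size+1) x size grid
            a = gray[y * h // size][x * w // (size + 1)]
            b = gray[y * h // size][(x + 1) * w // (size + 1)]
            bits.append(1 if a > b else 0)
    return bits

def hamming(b1, b2):
    return sum(x != y for x, y in zip(b1, b2))

class BubbleDedup:
    def __init__(self, max_hashes=5, threshold=5):
        self.hashes = []            # hashes of the last N bubbles
        self.max_hashes = max_hashes
        self.threshold = threshold

    def is_duplicate(self, gray):
        h = dhash_bits(gray)
        dup = any(hamming(h, old) < self.threshold for old in self.hashes)
        if not dup:
            self.hashes.append(h)
            self.hashes = self.hashes[-self.max_hashes:]  # keep only the last N
        return dup

# A flat grid and a decreasing gradient produce very different hashes
flat = [[128] * 18 for _ in range(16)]
grad = [[(17 - x) * 10 for x in range(18)] for _ in range(16)]
dedup = BubbleDedup()
print(dedup.is_duplicate(flat))  # False: first sighting is recorded
print(dedup.is_duplicate(flat))  # True: identical image, distance 0
print(dedup.is_duplicate(grad))  # False: gradient differs strongly
```

Because near-identical images differ in only a few bits, the Hamming-distance threshold (default 5, as in the document) tolerates small rendering differences while still catching re-detections of the same bubble.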
#### LLM Integration

@@ -598,6 +599,22 @@ Wolf Chat is a chatbot based on the MCP (Modular Capability Provider) framework

- **Dependency**: console event handling on Windows still depends on the `pywin32` package. If it is not installed, the program prints a warning and shutdown reliability may drop slightly (though `stdio_client`'s normal cleanup should still work in most cases).
- **Effect**: restores compatibility with the `mcp` library while achieving the goal of shutting down MCP server subprocesses on exit, via standard context management plus the auxiliary Windows event handling.
## Recent Improvements (2025-05-12)

### Game Window Topmost Logic Change

- **Goal**: change `game_monitor.py` from forcing the game window to be "Always on Top" to temporarily bringing it to the foreground with focus (activate), fixing the problem that the old approach merely drew the window over others.
- **`game_monitor.py`**:
  - In the monitoring loop of `monitor_game_window`, removed the code that used `win32gui.SetWindowPos` with `win32con.HWND_TOPMOST` to check and set the `WS_EX_TOPMOST` style.
  - Replaced it with a check of whether the current foreground window (`win32gui.GetForegroundWindow()`) is the target game window (`hwnd`).
  - If not, the following steps try to bring the window to the foreground and give it focus:
    1. Call `win32gui.SetWindowPos` with the `win32con.HWND_TOP` flag to raise the window above all non-topmost windows.
    2. Call `win32gui.SetForegroundWindow(hwnd)` to try to make it the foreground window with focus.
    3. After a short delay, check whether the window actually became the foreground window.
    4. If `SetForegroundWindow` did not succeed, fall back to the `window.activate()` method provided by the `pygetwindow` library.
  - Updated the related log messages to reflect the new behavior and fallback logic.
- **Effect**: the monitor script now uses a more thorough approach, including a fallback, to re-activate and foreground a game window that has lost focus, improving the success rate across different Windows environments. This replaces the previous behavior of merely forcing visual overlap.
## Development Suggestions

### Optimization Directions

@@ -622,6 +639,43 @@ Wolf Chat is a chatbot based on the MCP (Modular Capability Provider) framework

- Add topic recognition and memory features
- Explore context understanding across multi-turn conversations
## Recent Improvements (2025-05-13)

### Game Monitor Module Refactoring

- **Goal**: refactor the game monitoring feature from the standalone `game_monitor.py` script into a more robust, more manageable `game_manager.py` module whose lifecycle and configuration are controlled by `Setup.py`.
- **`game_manager.py` (new module)**:
  - Created the `GameMonitor` class, encapsulating all game window monitoring, auto-restart, and process management logic.
  - Provides a `create_game_monitor` factory function.
  - Supports configuration via the constructor and the `update_config` method.
  - Communicates with its caller (`Setup.py`) through a callback function (`callback`), for example when a game restart completes.
  - Keeps a standalone mode so it still works when run directly (mainly for testing or backward compatibility).
  - Code comments and log messages have been updated to English.
- **New automatic crash recovery (2025-05-15)**:
  - The `_monitor_loop` method first checks whether the game process (`_is_game_running`) is still running.
  - If the process has disappeared, it logs a warning and tries to relaunch the game (`_start_game_process`).
  - Added an `_is_game_running` method that uses `psutil` to check whether a game process with the configured name is running.
- **`Setup.py` (modified)**:
  - Imports `game_manager`.
  - Initializes `self.game_monitor = None` in the `WolfChatSetup` class's `__init__` method.
  - In `start_managed_session`:
    - Creates a `game_monitor_callback` function to handle actions from `GameMonitor` (notably `restart_complete`).
    - Creates a `GameMonitor` instance via `game_manager.create_game_monitor`.
    - Starts the `GameMonitor`.
  - Added a `_handle_game_restart_complete` method that handles restarting the bot after receiving the `GameMonitor` restart-complete callback.
  - In `stop_managed_session`, calls `self.game_monitor.stop()` and releases the instance.
  - Modified `_restart_game_managed` so that when `self.game_monitor` exists and is running, it calls `self.game_monitor.restart_now()` to perform the game restart.
  - In `save_settings`, if a `self.game_monitor` instance exists, calls its `update_config` method to update the runtime configuration.
- **`main.py` (modified)**:
  - Removed all code related to importing the old `game_monitor.py`, launching its subprocess, reading its signals, and managing its lifecycle. Game monitoring is now handled entirely by `Setup.py` in managed-session mode.
- **Old file deleted**:
  - Deleted the original `game_monitor.py` file.
- **Effects**:
  - Game monitoring logic is more cohesive and modular.
  - `Setup.py` now fully controls starting, stopping, and configuring game monitoring, simplifying `main.py`'s responsibilities.
  - The callback mechanism gives clearer inter-module communication.
  - Improved code maintainability and extensibility.
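The callback wiring between `GameMonitor` and `Setup.py` described above can be sketched as follows (method bodies are illustrative stubs; the real classes do the actual process and window work):

```python
# Sketch of the GameMonitor -> Setup.py callback mechanism. Class and method
# names mirror the document; the bodies here are stand-in stubs.

class GameMonitor:
    def __init__(self, callback=None):
        self.callback = callback

    def restart_now(self):
        # ... the real class terminates and relaunches the game process here ...
        if self.callback:
            self.callback("restart_complete")  # notify the caller when done

class WolfChatSetup:
    def __init__(self):
        self.game_monitor = None
        self.events = []

    def start_managed_session(self):
        def game_monitor_callback(action):
            if action == "restart_complete":
                self._handle_game_restart_complete()
        self.game_monitor = GameMonitor(callback=game_monitor_callback)

    def _handle_game_restart_complete(self):
        # Setup.py restarts the bot after the game has been restarted
        self.events.append("bot_restarted")

setup = WolfChatSetup()
setup.start_managed_session()
setup.game_monitor.restart_now()
print(setup.events)  # ['bot_restarted']
```

The inversion here is the point of the refactor: `GameMonitor` never imports `Setup.py`; it only calls whatever callback it was handed, so the modules stay decoupled.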
### Notes

1. **Image templates**: make sure every required UI element template has been captured and placed in the templates directory
batch_memory_record.py (new file, 208 lines)
@@ -0,0 +1,208 @@

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Wolf Chat batch memory backup tool

Automatically scans the chat_logs folder and runs a memory backup
for every log file it finds.
"""

import os
import re
import sys
import time
import argparse
import subprocess
import logging
from datetime import datetime
from typing import List, Optional, Tuple

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("batch_backup.log"),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger("BatchMemoryBackup")


def find_log_files(log_dir: str = "chat_logs") -> List[Tuple[str, str]]:
    """
    Scan the given directory for log files matching the YYYY-MM-DD.log pattern.

    Returns: [(date string, file path), ...], sorted by date.
    """
    date_pattern = re.compile(r'^(\d{4}-\d{2}-\d{2})\.log$')
    log_files = []

    # Make sure the directory exists
    if not os.path.exists(log_dir) or not os.path.isdir(log_dir):
        logger.error(f"Directory does not exist or is not valid: {log_dir}")
        return []

    # Scan the directory
    for filename in os.listdir(log_dir):
        match = date_pattern.match(filename)
        if match:
            date_str = match.group(1)
            file_path = os.path.join(log_dir, filename)
            try:
                # Validate the date format
                datetime.strptime(date_str, "%Y-%m-%d")
                log_files.append((date_str, file_path))
            except ValueError:
                logger.warning(f"Found invalid date format: {filename}")

    # Sort by date
    log_files.sort(key=lambda x: x[0])
    return log_files


def process_log_file(date_str: str, backup_script: str = "memory_backup.py") -> bool:
    """
    Run the memory backup for the log file of the given date.

    Parameters:
        date_str: date string in YYYY-MM-DD format
        backup_script: path to the backup script

    Returns:
        bool: whether the operation succeeded
    """
    logger.info(f"Processing logs for date {date_str}")

    try:
        # Build the command
        cmd = [sys.executable, backup_script, "--backup", "--date", date_str]

        # Run the command
        logger.info(f"Running command: {' '.join(cmd)}")
        process = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
            check=False  # do not raise when the command fails
        )

        # Check the result
        if process.returncode == 0:
            logger.info(f"Finished processing date {date_str}")
            return True
        else:
            logger.error(f"Processing date {date_str} failed: {process.stderr}")
            return False

    except Exception as e:
        logger.error(f"Exception while processing date {date_str}: {str(e)}")
        return False


def batch_process(log_dir: str = "chat_logs", backup_script: str = "memory_backup.py",
                  date_range: Optional[Tuple[str, str]] = None,
                  wait_seconds: int = 5) -> Tuple[int, int]:
    """
    Process multiple log files in a batch.

    Parameters:
        log_dir: path to the log directory
        backup_script: path to the backup script
        date_range: (start date, end date) limiting the range, in YYYY-MM-DD format
        wait_seconds: seconds to wait after each file is processed

    Returns:
        (success count, total count)
    """
    log_files = find_log_files(log_dir)

    if not log_files:
        logger.warning(f"No valid log files found in {log_dir}")
        return (0, 0)

    logger.info(f"Found {len(log_files)} log files")

    # Filter the files if a date range was given
    if date_range:
        start_date, end_date = date_range
        filtered_files = [(date_str, path) for date_str, path in log_files
                          if start_date <= date_str <= end_date]
        logger.info(f"{len(filtered_files)} files remain after filtering by range {start_date} to {end_date}")
        log_files = filtered_files

    success_count = 0
    total_count = len(log_files)

    for i, (date_str, file_path) in enumerate(log_files):
        logger.info(f"Progress: {i+1}/{total_count} - date: {date_str}")

        if process_log_file(date_str, backup_script):
            success_count += 1

        # If this is not the last file, wait before processing the next one
        if i < total_count - 1:
            logger.info(f"Waiting {wait_seconds} seconds before the next file...")
            time.sleep(wait_seconds)

    return (success_count, total_count)


def parse_date_arg(date_arg: str) -> Optional[str]:
    """Parse a date argument, ensuring it is in YYYY-MM-DD format"""
    if not date_arg:
        return None

    try:
        parsed_date = datetime.strptime(date_arg, "%Y-%m-%d")
        return parsed_date.strftime("%Y-%m-%d")
    except ValueError:
        logger.error(f"Invalid date format: {date_arg}, please use YYYY-MM-DD")
        return None


def main():
    parser = argparse.ArgumentParser(description='Wolf Chat batch memory backup tool')
    parser.add_argument('--log-dir', default='chat_logs', help='log file directory, default chat_logs')
    parser.add_argument('--script', default='memory_backup.py', help='path to the memory backup script, default memory_backup.py')
    parser.add_argument('--start-date', help='start date (inclusive), YYYY-MM-DD')
    parser.add_argument('--end-date', help='end date (inclusive), YYYY-MM-DD')
    parser.add_argument('--wait', type=int, default=5, help='seconds to wait between files, default 5')

    args = parser.parse_args()

    # Validate the date arguments
    start_date = parse_date_arg(args.start_date)
    end_date = parse_date_arg(args.end_date)

    # If only one date argument is given, use it for both (process that date only)
    if start_date and not end_date:
        end_date = start_date
    elif end_date and not start_date:
        start_date = end_date

    date_range = (start_date, end_date) if start_date and end_date else None

    logger.info("Starting the batch memory backup process")
    logger.info(f"Log directory: {args.log_dir}")
    logger.info(f"Backup script: {args.script}")
    if date_range:
        logger.info(f"Date range: {date_range[0]} to {date_range[1]}")
    else:
        logger.info("Processing all log files found")
    logger.info(f"Wait interval: {args.wait} seconds")

    start_time = time.time()
    success, total = batch_process(
        log_dir=args.log_dir,
        backup_script=args.script,
        date_range=date_range,
        wait_seconds=args.wait
    )
    end_time = time.time()

    duration = end_time - start_time
    logger.info(f"Batch processing finished. Success: {success}/{total}, elapsed: {duration:.2f} seconds")

    if success < total:
        logger.warning("Some log files failed; check the log for details")
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
```
@@ -47,6 +47,14 @@

```json
      "hsv_upper": [107, 255, 255],
      "min_area": 2500,
      "max_area": 300000
    },
    {
      "name": "easter",
      "is_bot": false,
      "hsv_lower": [5, 154, 183],
      "hsv_upper": [29, 255, 255],
      "min_area": 2500,
      "max_area": 300000
    }
  ]
}
```
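A minimal sketch of how such an HSV colour entry might be applied when classifying a candidate bubble region. The real pipeline presumably builds OpenCV masks; `match_bubble` and the mean-HSV/area inputs are hypothetical helpers used only to show the rule:

```python
# Hypothetical classification rule: a region matches a bubble colour config when
# its mean HSV colour lies within the configured bounds and its pixel area lies
# within [min_area, max_area].

BUBBLE_COLORS = [
    {"name": "easter", "is_bot": False,
     "hsv_lower": [5, 154, 183], "hsv_upper": [29, 255, 255],
     "min_area": 2500, "max_area": 300000},
]

def match_bubble(hsv, area, configs=BUBBLE_COLORS):
    """Return the name of the first config whose HSV bounds and area limits fit."""
    for cfg in configs:
        in_range = all(lo <= v <= hi for v, lo, hi in
                       zip(hsv, cfg["hsv_lower"], cfg["hsv_upper"]))
        if in_range and cfg["min_area"] <= area <= cfg["max_area"]:
            return cfg["name"]
    return None

print(match_bubble([20, 200, 220], 5000))   # easter
print(match_bubble([20, 200, 220], 100))    # None: area below min_area
print(match_bubble([120, 200, 220], 5000))  # None: hue out of range
```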
chroma_client.py (updated; the new side of the diff is shown)

@@ -1,6 +1,7 @@

```python
# chroma_client.py
import chromadb
from chromadb.config import Settings
from chromadb.utils import embedding_functions  # New import
import os
import json
import config
```

@@ -10,6 +11,33 @@ import time

```python
_client = None
_collections = {}

# Global embedding function variable
_embedding_function = None

def get_embedding_function():
    """Gets or creates the embedding function based on config"""
    global _embedding_function
    if _embedding_function is None:
        # Default to paraphrase-multilingual-mpnet-base-v2 if not specified or on error
        model_name = getattr(config, 'EMBEDDING_MODEL_NAME', "sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
        try:
            _embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name=model_name)
            print(f"Successfully initialized embedding function with model: {model_name}")
        except Exception as e:
            print(f"Failed to initialize embedding function with model '{model_name}': {e}")
            # Fallback to default if specified model fails and it's not already the default
            if model_name != "sentence-transformers/paraphrase-multilingual-mpnet-base-v2":
                print("Falling back to default embedding model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
                try:
                    _embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
                    print("Successfully initialized embedding function with default model.")
                except Exception as e_default:
                    print(f"Failed to initialize default embedding function: {e_default}")
                    _embedding_function = None  # Ensure it's None if all attempts fail
            else:
                _embedding_function = None  # Ensure it's None if default model also fails
    return _embedding_function

def initialize_chroma_client():
    """Initializes and connects to ChromaDB"""
    global _client
```

@@ -34,13 +62,31 @@ def get_collection(collection_name):

```python
    if collection_name not in _collections:
        try:
            emb_func = get_embedding_function()
            if emb_func is None:
                print(f"Failed to get or create collection '{collection_name}' due to embedding function initialization failure.")
                return None

            _collections[collection_name] = _client.get_or_create_collection(
                name=collection_name,
                embedding_function=emb_func
            )
            print(f"Successfully got or created collection '{collection_name}' using configured embedding function.")
        except Exception as e:
            print(f"Failed to get collection '{collection_name}' with configured embedding function: {e}")
            # Attempt to create collection with default embedding function as a fallback
            print(f"Attempting to create collection '{collection_name}' with default embedding function...")
            try:
                # Ensure we try the absolute default if the configured one (even if it was the default) failed
                default_emb_func = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
                _collections[collection_name] = _client.get_or_create_collection(
                    name=collection_name,
                    embedding_function=default_emb_func
                )
                print(f"Successfully got or created collection '{collection_name}' with default embedding function after initial failure.")
            except Exception as e_default:
                print(f"Failed to get collection '{collection_name}' even with default embedding function: {e_default}")
                return None

    return _collections[collection_name]
```
game_manager.py (new file, 664 lines)

@@ -0,0 +1,664 @@
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Game Manager Module

Provides game window monitoring, automatic restart, and process management features.
Designed to be imported and controlled by setup.py or other management scripts.
"""

import os
import sys
import time
import json
import threading
import subprocess
import logging
import pygetwindow as gw

# Attempt to import platform-specific modules that might be needed
try:
    import win32gui
    import win32con
    HAS_WIN32 = True
except ImportError:
    HAS_WIN32 = False
    print("Warning: win32gui/win32con modules not installed, some window management features may be unavailable")

try:
    import psutil
    HAS_PSUTIL = True
except ImportError:
    HAS_PSUTIL = False
    print("Warning: psutil module not installed, process management features may be unavailable")


class GameMonitor:
    """
    Game window monitoring class.
    Responsible for monitoring game window position, scheduled restarts, and providing window management functions.
    """
    def __init__(self, config_data, remote_data=None, logger=None, callback=None):
        # Use the provided logger or create a new one
        self.logger = logger or logging.getLogger("GameMonitor")
        if not self.logger.handlers:
            handler = logging.StreamHandler()
            formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
            handler.setFormatter(formatter)
            self.logger.addHandler(handler)
            self.logger.setLevel(logging.INFO)

        self.config_data = config_data
        self.remote_data = remote_data or {}
        self.callback = callback  # Callback function to notify the caller

        # Read settings from configuration
        self.window_title = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("WINDOW_TITLE", "Last War-Survival Game")
        self.enable_restart = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("ENABLE_SCHEDULED_RESTART", True)
        self.restart_interval = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("RESTART_INTERVAL_MINUTES", 60)
        self.game_path = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_EXECUTABLE_PATH", "")
        self.window_x = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_X", 50)
        self.window_y = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_Y", 30)
        self.window_width = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_WIDTH", 600)
        self.window_height = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("GAME_WINDOW_HEIGHT", 1070)
        self.monitor_interval = self.config_data.get("GAME_WINDOW_CONFIG", {}).get("MONITOR_INTERVAL_SECONDS", 5)

        # Read game process name from remote_data, use default if not found
        self.game_process_name = self.remote_data.get("GAME_PROCESS_NAME", "LastWar.exe")

        # Internal state
        self.running = False
        self.next_restart_time = None
        self.monitor_thread = None
        self.stop_event = threading.Event()

        # Focus tracking variables
        self.last_focus_failure_count = 0
        self.last_successful_foreground = time.time()

        self.logger.info(f"GameMonitor initialized. Game window: '{self.window_title}', Process: '{self.game_process_name}'")
        self.logger.info(f"Position: ({self.window_x}, {self.window_y}), Size: {self.window_width}x{self.window_height}")
        self.logger.info(f"Scheduled Restart: {'Enabled' if self.enable_restart else 'Disabled'}, Interval: {self.restart_interval} minutes")

    def start(self):
        """Start game window monitoring"""
        if self.running:
            self.logger.info("Game window monitoring is already running")
            return True  # Return True if already running

        self.logger.info("Starting game window monitoring...")
        self.stop_event.clear()

        # Set next restart time
        if self.enable_restart and self.restart_interval > 0:
            self.next_restart_time = time.time() + (self.restart_interval * 60)
            self.logger.info(f"Scheduled restart enabled. First restart in {self.restart_interval} minutes")
        else:
            self.next_restart_time = None
            self.logger.info("Scheduled restart is disabled")

        # Start monitoring thread
        self.monitor_thread = threading.Thread(target=self._monitor_loop, daemon=True)
        self.monitor_thread.start()
        self.running = True
        self.logger.info("Game window monitoring started")
        return True

    def stop(self):
        """Stop game window monitoring"""
        if not self.running:
            self.logger.info("Game window monitoring is not running")
            return True  # Return True if already stopped

        self.logger.info("Stopping game window monitoring...")
        self.stop_event.set()

        # Wait for monitoring thread to finish
        if self.monitor_thread and self.monitor_thread.is_alive():
            self.logger.info("Waiting for monitoring thread to finish...")
            self.monitor_thread.join(timeout=5)
            if self.monitor_thread.is_alive():
                self.logger.warning("Game window monitoring thread did not stop within the timeout period")

        self.running = False
        self.monitor_thread = None
        self.logger.info("Game window monitoring stopped")
        return True

    def _monitor_loop(self):
        """Main monitoring loop"""
        self.logger.info("Game window monitoring loop started")
        last_adjustment_message = ""  # Avoid logging repetitive adjustment messages

        while not self.stop_event.is_set():
            try:
                # Crash recovery: check the game process before anything else
                if not self._is_game_running():
                    self.logger.warning("Game process disappeared - restarting")
                    time.sleep(2)  # Let resources release
                    if self._start_game_process():
                        self.logger.info("Game restarted successfully")
                    else:
                        self.logger.error("Game restart failed")
                    time.sleep(self.monitor_interval)  # Wait before next check after a restart attempt
                    continue

                # Check for scheduled restart
                if self.next_restart_time and time.time() >= self.next_restart_time:
                    self.logger.info("Scheduled restart time reached. Performing restart...")
                    self._perform_restart()
                    # Reset next restart time
                    self.next_restart_time = time.time() + (self.restart_interval * 60)
                    self.logger.info(f"Restart timer reset. Next restart in {self.restart_interval} minutes")
                    # Continue to next loop iteration
                    time.sleep(self.monitor_interval)
                    continue

                # Find game window
                window = self._find_game_window()
                adjustment_made = False
                current_message = ""

                if window:
                    try:
                        # Use win32gui functions only on Windows
                        if HAS_WIN32:
                            # Get window handle
                            hwnd = window._hWnd

                            # 1. Check and adjust position/size
                            current_pos = (window.left, window.top)
                            current_size = (window.width, window.height)
                            target_pos = (self.window_x, self.window_y)
                            target_size = (self.window_width, self.window_height)

                            if current_pos != target_pos or current_size != target_size:
                                window.moveTo(target_pos[0], target_pos[1])
                                window.resizeTo(target_size[0], target_size[1])
                                time.sleep(0.1)
                                window.activate()
                                time.sleep(0.1)
                                # Check if changes were successful
                                new_pos = (window.left, window.top)
                                new_size = (window.width, window.height)
                                if new_pos == target_pos and new_size == target_size:
                                    current_message += "Adjusted window position/size. "
                                    adjustment_made = True

                            # 2. Check and bring to foreground using enhanced method
                            current_foreground_hwnd = win32gui.GetForegroundWindow()
                            if current_foreground_hwnd != hwnd:
                                # Use enhanced forceful focus method
                                success, method_used = self._force_window_foreground(hwnd, window)
                                if success:
                                    current_message += f"Focused window using {method_used}. "
                                    adjustment_made = True
                                    self.last_focus_failure_count = 0
                                else:
                                    # Increment failure counter
                                    self.last_focus_failure_count += 1

                                    # Log warning with consecutive failure count
                                    self.logger.warning(f"Window focus failed (attempt {self.last_focus_failure_count}): {method_used}")

                                    # Restart game after too many failures
                                    if self.last_focus_failure_count >= 15:
                                        self.logger.warning("Excessive focus failures, restarting game...")
                                        self._perform_restart()
                                        self.last_focus_failure_count = 0
                        else:
                            # Use basic functions on non-Windows platforms
                            current_pos = (window.left, window.top)
                            current_size = (window.width, window.height)
                            target_pos = (self.window_x, self.window_y)
                            target_size = (self.window_width, self.window_height)

                            if current_pos != target_pos or current_size != target_size:
                                window.moveTo(target_pos[0], target_pos[1])
                                window.resizeTo(target_size[0], target_size[1])
                                current_message += f"Adjusted game window to position {target_pos} size {target_size[0]}x{target_size[1]}. "
                                adjustment_made = True

                            # Try activating the window (may have limited effect on non-Windows)
                            try:
                                window.activate()
                                current_message += "Attempted to activate game window. "
                                # ... (remainder of file truncated in this view)
```
|
||||
adjustment_made = True
|
||||
except Exception as activate_err:
|
||||
self.logger.warning(f"Error activating window: {activate_err}")
|
||||
|
||||
except Exception as e:
|
||||
self.logger.error(f"Unexpected error while monitoring game window: {e}")
|
||||
|
||||
# Log only if adjustments were made and the message changed
|
||||
if adjustment_made and current_message and current_message != last_adjustment_message:
|
||||
self.logger.info(f"[GameMonitor] {current_message.strip()}")
|
||||
last_adjustment_message = current_message
|
||||
elif not window:
|
||||
# Reset last message if window disappears
|
||||
last_adjustment_message = ""
|
||||
|
||||
except Exception as e:
|
||||
self.logger.error(f"Error in monitoring loop: {e}")
|
||||
|
||||
# Wait for the next check
|
||||
time.sleep(self.monitor_interval)
|
||||
|
||||
self.logger.info("Game window monitoring loop finished")
|
||||
|
||||
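The scheduled-restart check above is plain wall-clock arithmetic: the deadline is stored as `time.time() + restart_interval * 60`, and every pass of the loop compares the current time against it, re-arming after each restart. A minimal sketch of that bookkeeping with an injectable clock so it can be exercised without waiting (the `RestartTimer` and `FakeClock` names are illustrative, not part of the module):

```python
import time


class RestartTimer:
    """Deadline bookkeeping mirroring the monitor loop: arm with an interval
    in minutes, ask whether the deadline has passed, re-arm after a restart."""

    def __init__(self, interval_minutes, now=time.time):
        self.interval_minutes = interval_minutes
        self._now = now  # Injectable clock, defaults to wall time
        self.next_restart_time = None
        self.reset()

    def reset(self):
        """Re-arm: next deadline is interval_minutes from now."""
        self.next_restart_time = self._now() + self.interval_minutes * 60

    def due(self):
        """True once the current time has reached the armed deadline."""
        return self.next_restart_time is not None and self._now() >= self.next_restart_time


class FakeClock:
    """Deterministic stand-in for time.time() in tests."""
    def __init__(self):
        self.t = 0.0
    def __call__(self):
        return self.t


# Armed at t=0 for 30 minutes: not due after 100 s, due at 1800 s
clock = FakeClock()
timer = RestartTimer(30, now=clock)
clock.t = 100.0
early = timer.due()
clock.t = 1800.0
late = timer.due()
```

The injectable clock is the only departure from the loop's inline arithmetic; it exists so the deadline logic can be verified deterministically.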
    def _is_game_running(self):
        """Check whether the game process is currently running."""
        if not HAS_PSUTIL:
            self.logger.warning("_is_game_running: psutil not available, cannot check process status.")
            return True  # Assume running if psutil is unavailable, to avoid unintended restarts
        try:
            return any(p.name().lower() == self.game_process_name.lower()
                       for p in psutil.process_iter(['name']))
        except Exception as e:
            self.logger.error(f"Error checking game process: {e}")
            return False  # Assume not running on error

    def _find_game_window(self):
        """Find the game window with the configured title."""
        try:
            windows = gw.getWindowsWithTitle(self.window_title)
            if windows:
                return windows[0]
        except Exception as e:
            self.logger.debug(f"Error finding game window: {e}")
        return None

    def _force_window_foreground(self, hwnd, window):
        """Aggressively try several strategies to bring the window to the foreground."""
        if not HAS_WIN32:
            return False, "win32 modules unavailable"

        methods_tried = []

        # Method 1: HWND_TOPMOST strategy
        methods_tried.append("HWND_TOPMOST")
        try:
            win32gui.SetWindowPos(hwnd, win32con.HWND_TOPMOST, 0, 0, 0, 0,
                                  win32con.SWP_NOMOVE | win32con.SWP_NOSIZE)
            time.sleep(0.1)
            win32gui.SetWindowPos(hwnd, win32con.HWND_TOP, 0, 0, 0, 0,
                                  win32con.SWP_NOMOVE | win32con.SWP_NOSIZE)

            win32gui.SetForegroundWindow(hwnd)
            time.sleep(0.2)
            if win32gui.GetForegroundWindow() == hwnd:
                return True, "HWND_TOPMOST"
        except Exception as e:
            self.logger.debug(f"Method 1 failed: {e}")

        # Method 2: Minimize/restore cycle
        methods_tried.append("MinimizeRestore")
        try:
            win32gui.ShowWindow(hwnd, win32con.SW_MINIMIZE)
            time.sleep(0.3)
            win32gui.ShowWindow(hwnd, win32con.SW_RESTORE)
            time.sleep(0.2)
            win32gui.SetForegroundWindow(hwnd)

            if win32gui.GetForegroundWindow() == hwnd:
                return True, "MinimizeRestore"
        except Exception as e:
            self.logger.debug(f"Method 2 failed: {e}")

        # Method 3: Attach this thread's input to the window's thread
        methods_tried.append("ThreadAttach")
        try:
            import win32process
            import win32api

            current_thread_id = win32api.GetCurrentThreadId()
            window_thread_id = win32process.GetWindowThreadProcessId(hwnd)[0]

            if current_thread_id != window_thread_id:
                win32process.AttachThreadInput(current_thread_id, window_thread_id, True)
                try:
                    win32gui.BringWindowToTop(hwnd)
                    win32gui.SetForegroundWindow(hwnd)

                    time.sleep(0.2)
                    if win32gui.GetForegroundWindow() == hwnd:
                        return True, "ThreadAttach"
                finally:
                    win32process.AttachThreadInput(current_thread_id, window_thread_id, False)
        except Exception as e:
            self.logger.debug(f"Method 3 failed: {e}")

        # Method 4: Flash the window, then post activation messages
        methods_tried.append("Flash+Messages")
        try:
            # First flash to get attention
            win32gui.FlashWindow(hwnd, True)
            time.sleep(0.2)

            # Then send specific window messages
            win32gui.SendMessage(hwnd, win32con.WM_SETREDRAW, 0, 0)
            win32gui.SendMessage(hwnd, win32con.WM_SETREDRAW, 1, 0)
            win32gui.RedrawWindow(hwnd, None, None,
                                  win32con.RDW_FRAME | win32con.RDW_INVALIDATE |
                                  win32con.RDW_UPDATENOW | win32con.RDW_ALLCHILDREN)

            win32gui.PostMessage(hwnd, win32con.WM_SYSCOMMAND, win32con.SC_RESTORE, 0)
            win32gui.PostMessage(hwnd, win32con.WM_ACTIVATE, win32con.WA_ACTIVE, 0)

            time.sleep(0.2)
            if win32gui.GetForegroundWindow() == hwnd:
                return True, "Flash+Messages"
        except Exception as e:
            self.logger.debug(f"Method 4 failed: {e}")

        # Method 5: Hide/show cycle
        methods_tried.append("HideShow")
        try:
            win32gui.ShowWindow(hwnd, win32con.SW_HIDE)
            time.sleep(0.2)
            win32gui.ShowWindow(hwnd, win32con.SW_SHOW)
            time.sleep(0.2)
            win32gui.SetForegroundWindow(hwnd)

            if win32gui.GetForegroundWindow() == hwnd:
                return True, "HideShow"
        except Exception as e:
            self.logger.debug(f"Method 5 failed: {e}")

        return False, f"All methods failed: {', '.join(methods_tried)}"

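`_force_window_foreground` walks an ordered list of focus strategies and reports the name of the first one that succeeds, or the full list on total failure. That control flow can be isolated from the Win32 calls as a small, dependency-free pattern (the `try_strategies` helper and strategy names below are illustrative, not part of the module):

```python
from typing import Callable, List, Tuple


def try_strategies(strategies: List[Tuple[str, Callable[[], bool]]]) -> Tuple[bool, str]:
    """Run each (name, attempt) pair in order; report the first success.

    Mirrors the fallback chain in _force_window_foreground: a strategy may
    return False or raise, and names of failed attempts are collected for
    the final report.
    """
    tried = []
    for name, attempt in strategies:
        tried.append(name)
        try:
            if attempt():
                return True, name
        except Exception:
            continue  # A failing strategy must not abort the chain
    return False, f"All methods failed: {', '.join(tried)}"


# Example: the first strategy fails, the second succeeds, the third never runs
ok, used = try_strategies([
    ("HWND_TOPMOST", lambda: False),
    ("MinimizeRestore", lambda: True),
    ("ThreadAttach", lambda: True),
])
```

Keeping the chain in one place like this makes it trivial to reorder strategies or add new ones without touching the per-strategy Win32 code.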
    def _find_game_process_by_window(self):
        """Find the game process via its window handle; return None so the caller can fall back."""
        if not HAS_PSUTIL or not HAS_WIN32:
            return None

        try:
            window = self._find_game_window()
            if not window:
                return None

            hwnd = window._hWnd
            window_pid = None
            try:
                import win32process
                _, window_pid = win32process.GetWindowThreadProcessId(hwnd)
            except Exception:
                return None

            if window_pid:
                try:
                    proc = psutil.Process(window_pid)
                    proc_name = proc.name()

                    if proc_name.lower() == self.game_process_name.lower():
                        self.logger.info(f"Found game process '{proc_name}' (PID: {proc.pid}) with window title '{self.window_title}'")
                    else:
                        self.logger.debug(f"Window process name mismatch: expected '{self.game_process_name}', got '{proc_name}'")
                    # Return the owning process even on a name mismatch: the window
                    # title matched, so this is still the best candidate.
                    return proc
                except Exception:
                    pass

            # The window lookup did not yield a usable process. Return None and
            # let _find_game_process fall back to its name-only search.
            return None

        except Exception as e:
            self.logger.error(f"Process-by-window lookup error: {e}")
            return None

    def _find_game_process(self):
        """Find the game process, preferring the window-based lookup."""
        # Try window-based process lookup first
        proc = self._find_game_process_by_window()
        if proc:
            return proc

        # Fall back to a name-only lookup
        if not HAS_PSUTIL:
            self.logger.debug("psutil not available for name-only process lookup fallback.")
            return None
        try:
            for p_iter in psutil.process_iter(['pid', 'name', 'exe']):
                try:
                    proc_name = p_iter.info.get('name')
                    if proc_name and proc_name.lower() == self.game_process_name.lower():
                        self.logger.info(f"Found game process by name '{proc_name}' (PID: {p_iter.pid}) as fallback")
                        return p_iter
                except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
                    continue
        except Exception as e:
            self.logger.error(f"Error in name-only game process lookup: {e}")

        self.logger.info(f"Game process '{self.game_process_name}' not found by name either.")
        return None

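The two-stage lookup (window-derived PID first, name scan second) reduces the chance of matching an unrelated process that merely shares the executable name. The shape of that fallback, with hypothetical zero-argument callables standing in for the psutil/win32-backed lookups:

```python
def find_process(by_window, by_name):
    """Return the first non-None result: window-based lookup, then name scan.

    by_window and by_name stand in for _find_game_process_by_window and the
    name-only psutil scan; either may return None when nothing matches.
    """
    proc = by_window()
    if proc is not None:
        return proc
    return by_name()


# Window lookup misses, so the name scan's answer is used
result = find_process(lambda: None, lambda: "LastWar.exe (pid=4242)")
```

Ordering matters here: the window-based result carries more evidence (title plus PID), so it always wins when available.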
    def _perform_restart(self):
        """Execute the game restart process."""
        self.logger.info("Starting game restart process")

        try:
            # 1. Notify that restart has begun (optional)
            if self.callback:
                self.callback("restart_begin")

            # 2. Terminate the existing game process
            self._terminate_game_process()
            time.sleep(2)  # Short wait to ensure process termination

            # 3. Start a new game process
            if self._start_game_process():
                self.logger.info("Game restarted successfully")
            else:
                self.logger.error("Failed to start game")

            # 4. Wait for the game to launch
            restart_wait_time = 45  # seconds, increased from 30
            self.logger.info(f"Waiting for game to start ({restart_wait_time} seconds)...")
            time.sleep(restart_wait_time)

            # 5. Notify restart completion
            self.logger.info("Game restart process completed, sending notification")
            if self.callback:
                self.callback("restart_complete")

            return True
        except Exception as e:
            self.logger.error(f"Error during game restart process: {e}")
            # Attempt to notify the error
            if self.callback:
                self.callback("restart_error")
            return False

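`_perform_restart` reports progress through an optional callback that receives plain action strings ("restart_begin", "restart_complete", "restart_error"), so a UI can react without the monitor knowing anything about it. A sketch of that notification contract under the simplifying assumption that only an exception counts as an error (`run_with_notifications` and `do_restart` are illustrative names):

```python
def run_with_notifications(do_restart, callback=None):
    """Drive one restart attempt, emitting the same action strings
    _perform_restart uses; do_restart stands in for the
    terminate/start/wait sequence and may raise on failure."""
    if callback:
        callback("restart_begin")
    try:
        ok = do_restart()
        if callback:
            callback("restart_complete")
        return ok
    except Exception:
        if callback:
            callback("restart_error")
        return False


# A listener just collects the actions it receives
events = []
run_with_notifications(lambda: True, events.append)
# events is now ["restart_begin", "restart_complete"]
```

Because the callback is optional and takes a bare string, any callable works as a listener, from `events.append` in tests to a UI pause/resume handler in production.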
    def _terminate_game_process(self):
        """Terminate the game process."""
        self.logger.info(f"Attempting to terminate game process '{self.game_process_name}'")

        if not HAS_PSUTIL:
            self.logger.warning("psutil is not available, cannot terminate process")
            return False

        process = self._find_game_process()
        terminated = False

        if process:
            try:
                self.logger.info(f"Found game process PID: {process.pid}, terminating...")
                process.terminate()

                try:
                    process.wait(timeout=5)
                    self.logger.info(f"Process {process.pid} terminated successfully (terminate)")
                    terminated = True
                except psutil.TimeoutExpired:
                    self.logger.warning(f"Process {process.pid} did not terminate within 5s (terminate), attempting force kill")
                    process.kill()
                    process.wait(timeout=5)
                    self.logger.info(f"Process {process.pid} force killed (kill)")
                    terminated = True
            except Exception as e:
                self.logger.error(f"Error terminating process: {e}")
        else:
            self.logger.warning(f"No running process found with name '{self.game_process_name}'")

        return terminated

    def _start_game_process(self):
        """Start the game process."""
        if not self.game_path:
            self.logger.error("Game executable path not set, cannot start")
            return False

        self.logger.info(f"Starting game: {self.game_path}")
        try:
            if sys.platform == "win32":
                os.startfile(self.game_path)
                self.logger.info("Called os.startfile to launch game")
                return True
            else:
                # Use subprocess.Popen on non-Windows platforms; start a new
                # session so the game is detached from this process.
                subprocess.Popen([self.game_path], start_new_session=True)
                self.logger.info("Called subprocess.Popen to launch game")
                return True
        except FileNotFoundError:
            self.logger.error(f"Startup error: Game launcher '{self.game_path}' not found")
        except OSError as ose:
            self.logger.error(f"Startup error (OSError): {ose} - Check path and permissions", exc_info=True)
        except Exception as e:
            self.logger.error(f"Unexpected error starting game: {e}", exc_info=True)

        return False

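The launch path above dispatches on `sys.platform`: `os.startfile` on Windows (fire-and-forget, no process handle) and a detached `subprocess.Popen` elsewhere, with errors folded into a boolean. A standalone sketch of the same shape that also reports *why* a launch failed, which the method only logs (the `launch_detached` helper is illustrative):

```python
import os
import subprocess
import sys


def launch_detached(path):
    """Best-effort detached launch mirroring _start_game_process's
    fallbacks. Returns (ok, reason) instead of logging."""
    if not path:
        return False, "path not set"
    try:
        if sys.platform == "win32":
            os.startfile(path)  # Returns immediately; no handle to the child
        else:
            # start_new_session detaches the child from our process group
            subprocess.Popen([path], start_new_session=True)
        return True, "launched"
    except FileNotFoundError:
        return False, "launcher not found"
    except OSError as e:
        return False, f"os error: {e}"
```

Returning a reason string rather than logging keeps the helper testable; the caller decides whether a failure is worth a warning or a retry.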
    def restart_now(self):
        """Perform an immediate restart."""
        self.logger.info("Manually triggering game restart")
        result = self._perform_restart()

        # Reset the timer if scheduled restart is enabled
        if self.enable_restart and self.restart_interval > 0:
            self.next_restart_time = time.time() + (self.restart_interval * 60)
            self.logger.info(f"Restart timer reset. Next restart in {self.restart_interval} minutes")

        return result

    def update_config(self, config_data=None, remote_data=None):
        """Update configuration settings."""
        if config_data:
            old_config = self.config_data
            self.config_data = config_data

            # Update key settings
            window_cfg = self.config_data.get("GAME_WINDOW_CONFIG", {})
            self.window_title = window_cfg.get("WINDOW_TITLE", self.window_title)
            self.enable_restart = window_cfg.get("ENABLE_SCHEDULED_RESTART", self.enable_restart)
            self.restart_interval = window_cfg.get("RESTART_INTERVAL_MINUTES", self.restart_interval)
            self.game_path = window_cfg.get("GAME_EXECUTABLE_PATH", self.game_path)
            self.window_x = window_cfg.get("GAME_WINDOW_X", self.window_x)
            self.window_y = window_cfg.get("GAME_WINDOW_Y", self.window_y)
            self.window_width = window_cfg.get("GAME_WINDOW_WIDTH", self.window_width)
            self.window_height = window_cfg.get("GAME_WINDOW_HEIGHT", self.window_height)
            self.monitor_interval = window_cfg.get("MONITOR_INTERVAL_SECONDS", self.monitor_interval)

            # Reset the scheduled restart timer if its parameters changed
            if self.running and self.enable_restart and self.restart_interval > 0:
                old_interval = old_config.get("GAME_WINDOW_CONFIG", {}).get("RESTART_INTERVAL_MINUTES", 60)
                if self.restart_interval != old_interval:
                    self.next_restart_time = time.time() + (self.restart_interval * 60)
                    self.logger.info(f"Restart interval updated to {self.restart_interval} minutes, next restart reset")

        if remote_data:
            self.remote_data = remote_data
            old_process_name = self.game_process_name
            self.game_process_name = self.remote_data.get("GAME_PROCESS_NAME", old_process_name)
            if self.game_process_name != old_process_name:
                self.logger.info(f"Game process name updated to '{self.game_process_name}'")

        self.logger.info("GameMonitor configuration updated")


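`update_config` applies incoming values on top of the current ones, keeping each old value whenever the key is absent (the `window_cfg.get(key, old_value)` pattern). That overlay semantics can be stated in a few testable lines (the `overlay_settings` helper is illustrative):

```python
def overlay_settings(current: dict, incoming: dict) -> dict:
    """Return current updated with incoming, mirroring update_config's
    .get(key, old_value) pattern: keys missing from incoming keep their
    existing values; the inputs are not mutated."""
    merged = dict(current)
    for key, value in incoming.items():
        merged[key] = value
    return merged


settings = {"WINDOW_TITLE": "Last War", "RESTART_INTERVAL_MINUTES": 60}
updated = overlay_settings(settings, {"RESTART_INTERVAL_MINUTES": 30})
# updated == {"WINDOW_TITLE": "Last War", "RESTART_INTERVAL_MINUTES": 30}
```

Working on a copy keeps the old configuration intact, which is exactly what `update_config` relies on when it compares the old and new restart intervals before resetting the timer.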
# Provide simple external API functions

def create_game_monitor(config_data, remote_data=None, logger=None, callback=None):
    """Create a game monitor instance."""
    return GameMonitor(config_data, remote_data, logger, callback)


def stop_all_monitors():
    """Attempt to stop all created monitors (global cleanup).

    Not implemented: this would require keeping references to every
    instance. In the current design, each monitor must be stopped
    individually.
    """
    pass


# Standalone entry point (similar to the original game_monitor.py)
if __name__ == "__main__":
    # Set up basic logging
    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    logger = logging.getLogger("GameManagerStandalone")

    # Load settings from config.py
    try:
        import config
        logger.info("Loaded config.py")

        # Build the basic configuration dictionary
        config_data = {
            "GAME_WINDOW_CONFIG": {
                "WINDOW_TITLE": config.WINDOW_TITLE,
                "ENABLE_SCHEDULED_RESTART": config.ENABLE_SCHEDULED_RESTART,
                "RESTART_INTERVAL_MINUTES": config.RESTART_INTERVAL_MINUTES,
                "GAME_EXECUTABLE_PATH": config.GAME_EXECUTABLE_PATH,
                "GAME_WINDOW_X": config.GAME_WINDOW_X,
                "GAME_WINDOW_Y": config.GAME_WINDOW_Y,
                "GAME_WINDOW_WIDTH": config.GAME_WINDOW_WIDTH,
                "GAME_WINDOW_HEIGHT": config.GAME_WINDOW_HEIGHT,
                "MONITOR_INTERVAL_SECONDS": config.MONITOR_INTERVAL_SECONDS
            }
        }

        # Define a callback for standalone execution
        def standalone_callback(action):
            """Send a JSON signal via standard output."""
            logger.info(f"Sending signal: {action}")
            signal_data = {'action': action}
            try:
                json_signal = json.dumps(signal_data)
                print(json_signal, flush=True)
                logger.info(f"Signal sent: {action}")
            except Exception as e:
                logger.error(f"Failed to send signal '{action}': {e}")

        # Create and start the monitor
        monitor = GameMonitor(config_data, logger=logger, callback=standalone_callback)
        monitor.start()

        # Keep the program running
        try:
            logger.info("Game monitoring started. Press Ctrl+C to stop.")
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            logger.info("Ctrl+C received, stopping...")
        finally:
            monitor.stop()
            logger.info("Game monitoring stopped")

    except ImportError:
        logger.error("Could not load config.py. Ensure it exists and contains the necessary settings.")
        sys.exit(1)
    except Exception as e:
        logger.error(f"Error starting game monitoring: {e}", exc_info=True)
        sys.exit(1)
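In standalone mode the monitor talks to its parent over stdout using one JSON object per line (e.g. `{"action": "restart_complete"}`), while all logging goes to stderr. A parent process can consume that stream defensively, skipping anything that is not a well-formed signal (the `parse_signal` helper is illustrative, not part of the module):

```python
import json


def parse_signal(line):
    """Return the action from a one-line JSON signal, or None for noise.

    stdout is reserved for signals, but a defensive parser still ignores
    lines that are not JSON objects carrying an 'action' key.
    """
    line = line.strip()
    if not line.startswith("{"):
        return None
    try:
        data = json.loads(line)
    except json.JSONDecodeError:
        return None
    return data.get("action") if isinstance(data, dict) else None


# Example stream mixing one valid signal with stray output
stream = ['{"action": "restart_complete"}', 'not json', '']
actions = [a for a in map(parse_signal, stream) if a]
# actions == ["restart_complete"]
```

A newline-delimited JSON protocol like this stays readable in logs and needs no framing beyond `flush=True` on the writer's side.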
game_monitor.py (284 lines deleted)
@@ -1,284 +0,0 @@
#!/usr/bin/env python
"""
Game Window Monitor Module

Continuously monitors the game window specified in the config,
ensuring it stays at the configured position, size, and remains topmost.
"""

import time
import datetime
import subprocess
import psutil
import sys
import json
import os  # Used for path checks and os.startfile
import pygetwindow as gw
import win32gui
import win32con
import config
import logging
# import multiprocessing  # Keep for Pipe/Queue if needed later; using stdio now
# NOTE: config.py handles dotenv loading. This script only imports values.

# --- Setup Logging ---
monitor_logger = logging.getLogger('GameMonitor')
monitor_logger.setLevel(logging.INFO)
log_formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
# Log to stderr so stdout stays reserved for JSON signals
stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setFormatter(log_formatter)
# Avoid adding duplicate handlers if this runs more than once
if not monitor_logger.hasHandlers():
    monitor_logger.addHandler(stderr_handler)
monitor_logger.propagate = False  # Don't propagate to the root logger

# --- Helper Functions ---

def restart_game_process():
    """Finds and terminates the existing game process, then restarts it."""
    monitor_logger.info("Attempting to restart game process.")
    game_path = config.GAME_EXECUTABLE_PATH
    if not game_path or not os.path.exists(os.path.dirname(game_path)):  # Basic check
        monitor_logger.error(f"Game executable path '{game_path}' is invalid or its directory does not exist, cannot restart.")
        return

    target_process_name = "LastWar.exe"  # Correct process name
    launcher_path = config.GAME_EXECUTABLE_PATH  # Keep launcher path for restarting
    monitor_logger.info(f"Looking for game process named '{target_process_name}'")

    terminated = False
    process_found = False
    for proc in psutil.process_iter(['pid', 'name', 'exe']):
        try:
            proc_info = proc.info
            proc_name = proc_info.get('name')

            if proc_name == target_process_name:
                process_found = True
                monitor_logger.info(f"Found game process PID: {proc_info['pid']}, Name: {proc_name}. Terminating...")
                proc.terminate()
                try:
                    proc.wait(timeout=5)
                    monitor_logger.info(f"Process {proc_info['pid']} terminated successfully (terminate).")
                    terminated = True
                except psutil.TimeoutExpired:
                    monitor_logger.warning(f"Process {proc_info['pid']} did not terminate in 5s (terminate), attempting kill.")
                    proc.kill()
                    proc.wait(timeout=5)  # Wait for kill with timeout
                    monitor_logger.info(f"Process {proc_info['pid']} killed.")
                    terminated = True
                except Exception as wait_kill_err:
                    monitor_logger.error(f"Error while waiting for process {proc_info['pid']} to be killed: {wait_kill_err}", exc_info=False)

                # Termination verification removed - rely on the main loop for eventual state correction
                monitor_logger.info(f"Processed matching process PID: {proc_info['pid']}, stopping search.")
                break  # Exit the loop once a process is handled
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            pass  # Process may have already exited, access may be denied, or it is a zombie
        except Exception as e:
            pid_str = proc.pid if hasattr(proc, 'pid') else 'N/A'
            monitor_logger.error(f"Error checking or terminating process PID:{pid_str}: {e}", exc_info=False)

    if process_found and not terminated:
        monitor_logger.error("Found game process but failed to terminate it successfully.")
    elif not process_found:
        monitor_logger.warning(f"No running process named '{target_process_name}' was found.")

    # Wait a moment before restarting, using the launcher path from config
    time.sleep(2)
    if not launcher_path or not os.path.exists(os.path.dirname(launcher_path)):
        monitor_logger.error(f"Game launcher path '{launcher_path}' is invalid or its directory does not exist, cannot launch.")
        return

    monitor_logger.info(f"Launching game using launcher: {launcher_path}")
    try:
        if sys.platform == "win32":
            os.startfile(launcher_path)
            monitor_logger.info("os.startfile called to launch game.")
        else:
            subprocess.Popen([launcher_path])
            monitor_logger.info("subprocess.Popen called to launch game.")
    except FileNotFoundError:
        monitor_logger.error(f"Launch Error: Game launcher not found at '{launcher_path}'.")
    except OSError as ose:
        monitor_logger.error(f"Launch Error (OSError): {ose} - Check path and permissions.", exc_info=True)
    except Exception as e:
        monitor_logger.error(f"Unexpected error while launching game: {e}", exc_info=True)
    # Don't return False here; the caller should continue and send the resume signal.
    # Startup verification removed - rely on the main loop for eventual state correction.
    return

def perform_scheduled_restart():
    """Handles the scheduled restart: attempt the restart, then signal completion."""
    monitor_logger.info("Starting scheduled restart sequence.")

    # No pause_ui signal here - the UI handles its own pause/resume based on restart_complete

    try:
        # 1. Attempt to restart the game (no verification)
        monitor_logger.info("Attempting game restart process.")
        restart_game_process()  # Fire-and-forget restart attempt
        monitor_logger.info("Game restart attempt executed.")

        # 2. Wait a fixed time after the restart attempt
        monitor_logger.info("Waiting 30 seconds for game to launch (no verification)...")
        time.sleep(30)  # Fixed wait

    except Exception as restart_err:
        monitor_logger.error(f"Unexpected error while running restart_game_process: {restart_err}", exc_info=True)
        # Continue to the finally block even on error

    finally:
        # 3. Signal the main process via stdout that the restart attempt is complete
        monitor_logger.info("Sending restart complete signal.")
        restart_complete_signal_data = {'action': 'restart_complete'}
        json_signal = None
        try:
            json_signal = json.dumps(restart_complete_signal_data)
            print(json_signal, flush=True)
            monitor_logger.info("Sent restart complete signal.")
        except Exception as e:
            monitor_logger.error(f"Failed to send restart complete signal '{json_signal}': {e}", exc_info=True)

    monitor_logger.info("Scheduled restart sequence (including finally block) finished.")

def find_game_window(title=config.WINDOW_TITLE):
    """Attempts to find the game window by its title."""
    try:
        windows = gw.getWindowsWithTitle(title)
        if windows:
            return windows[0]
    except Exception as e:
        # Stay silent if the window is not found during a normal check;
        # enable this for debugging:
        # monitor_logger.error(f"Error finding window '{title}': {e}")
        pass
    return None

def monitor_game_window():
    """The main monitoring loop. Runs directly, not in a thread."""
    monitor_logger.info("Game window monitoring script started.")
    last_adjustment_message = ""  # Track the last message to avoid log spam
    next_restart_time = None

    # Initialize the scheduled restart timer if enabled
    if config.ENABLE_SCHEDULED_RESTART and config.RESTART_INTERVAL_MINUTES > 0:
        interval_seconds = config.RESTART_INTERVAL_MINUTES * 60
        next_restart_time = time.time() + interval_seconds
        monitor_logger.info(f"Scheduled restart enabled. First restart in {config.RESTART_INTERVAL_MINUTES} minutes.")
    else:
        monitor_logger.info("Scheduled restart is disabled.")

    while True:  # Run indefinitely until terminated externally
        # --- Scheduled Restart Check ---
        if next_restart_time and time.time() >= next_restart_time:
            monitor_logger.info("Scheduled restart time reached.")
            perform_scheduled_restart()
            # Reset the timer for the next interval
            interval_seconds = config.RESTART_INTERVAL_MINUTES * 60
            next_restart_time = time.time() + interval_seconds
            monitor_logger.info(f"Restart timer reset. Next restart in {config.RESTART_INTERVAL_MINUTES} minutes.")
            # Continue to the next loop iteration after the restart sequence
            time.sleep(config.MONITOR_INTERVAL_SECONDS)  # Small delay before the next check
            continue

        # --- Regular Window Monitoring ---
        window = find_game_window()
        adjustment_made = False
        current_message = ""

        if window:
            try:
                hwnd = window._hWnd  # Window handle for win32 functions

                # 1. Check and adjust position/size
                current_pos = (window.left, window.top)
                current_size = (window.width, window.height)
                target_pos = (config.GAME_WINDOW_X, config.GAME_WINDOW_Y)
                target_size = (config.GAME_WINDOW_WIDTH, config.GAME_WINDOW_HEIGHT)

                if current_pos != target_pos or current_size != target_size:
                    window.moveTo(target_pos[0], target_pos[1])
                    window.resizeTo(target_size[0], target_size[1])
                    # Verify the move/resize succeeded before logging
                    time.sleep(0.1)  # Give the window time to adjust
                    window.activate()  # Bring the window to the foreground before checking again
                    time.sleep(0.1)
                    new_pos = (window.left, window.top)
                    new_size = (window.width, window.height)
                    if new_pos == target_pos and new_size == target_size:
                        current_message += f"Adjusted game window to position {target_pos} size {target_size[0]}x{target_size[1]}. "
                        adjustment_made = True
                    else:
                        # Stay silent on failure for now; enable for debugging:
                        # monitor_logger.warning(f"Failed to adjust window. Current: {new_pos} {new_size}, Target: {target_pos} {target_size}")
                        pass

                # 2. Check and set topmost
                style = win32gui.GetWindowLong(hwnd, win32con.GWL_EXSTYLE)
                is_topmost = style & win32con.WS_EX_TOPMOST

                if not is_topmost:
                    # HWND_TOPMOST with SWP_NOMOVE | SWP_NOSIZE keeps position and size
                    win32gui.SetWindowPos(hwnd, win32con.HWND_TOPMOST, 0, 0, 0, 0,
                                          win32con.SWP_NOMOVE | win32con.SWP_NOSIZE)
                    # Verify
                    time.sleep(0.1)
                    new_style = win32gui.GetWindowLong(hwnd, win32con.GWL_EXSTYLE)
                    if new_style & win32con.WS_EX_TOPMOST:
                        current_message += "Set game window to topmost."
                        adjustment_made = True
                    else:
                        # Stay silent on failure; enable for debugging:
                        # monitor_logger.warning("Failed to set window to topmost.")
                        pass

            except gw.PyGetWindowException as e:
                # The window may have been closed between the find and this access
                monitor_logger.warning(f"Could not access window properties in monitor loop (may be closed): {e}")
            except Exception as e:
                monitor_logger.error(f"Unexpected error during game window monitoring: {e}", exc_info=True)

        # Log the adjustment message only if an adjustment was made and it
        # differs from the last one. This must NOT print JSON signals to stdout.
        if adjustment_made and current_message and current_message != last_adjustment_message:
            monitor_logger.info(f"[GameMonitor] {current_message.strip()}")
            last_adjustment_message = current_message
        elif not window:
            # Reset the last message if the window disappears
            last_adjustment_message = ""

        # Wait before the next check
        time.sleep(config.MONITOR_INTERVAL_SECONDS)

    # Unreachable in this design: the loop is infinite and termination is
    # handled externally by the parent process (main.py).
    # monitor_logger.info("Game window monitoring script stopped.")


# Example usage (if run directly)
if __name__ == '__main__':
    monitor_logger.info("Running game_monitor.py directly.")
    monitor_logger.info(f"Will monitor window with title '{config.WINDOW_TITLE}'")
|
||||
monitor_logger.info(f"目標位置: ({config.GAME_WINDOW_X}, {config.GAME_WINDOW_Y}), 目標大小: {config.GAME_WINDOW_WIDTH}x{config.GAME_WINDOW_HEIGHT}")
|
||||
monitor_logger.info(f"檢查間隔: {config.MONITOR_INTERVAL_SECONDS} 秒。(Check interval: {config.MONITOR_INTERVAL_SECONDS} seconds.)")
|
||||
if config.ENABLE_SCHEDULED_RESTART:
|
||||
monitor_logger.info(f"定時重啟已啟用,間隔: {config.RESTART_INTERVAL_MINUTES} 分鐘。(Scheduled restart enabled, interval: {config.RESTART_INTERVAL_MINUTES} minutes.)")
|
||||
else:
|
||||
monitor_logger.info("定時重啟已禁用。(Scheduled restart disabled.)")
|
||||
monitor_logger.info("腳本將持續運行,請從啟動它的終端使用 Ctrl+C 或由父進程終止。(Script will run continuously. Stop with Ctrl+C from the launching terminal or termination by parent process.)")
|
||||
|
||||
try:
|
||||
monitor_game_window() # Start the main loop directly
|
||||
except KeyboardInterrupt:
|
||||
monitor_logger.info("收到 Ctrl+C,正在退出...(Received Ctrl+C, exiting...)")
|
||||
except Exception as e:
|
||||
monitor_logger.critical(f"監控過程中發生致命錯誤: {e}", exc_info=True)
|
||||
sys.exit(1) # Exit with error code
|
||||
finally:
|
||||
monitor_logger.info("Game Monitor 腳本執行完畢。(Game Monitor script finished.)")
|
||||
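The loop above only logs an adjustment message when it differs from the last one it logged (`last_adjustment_message`). In isolation, that throttling pattern can be sketched like this (`dedup_messages` is a hypothetical helper, not part of game_monitor.py):

```python
def dedup_messages(messages):
    """Yield only messages that are non-empty and differ from the previously
    yielded one -- the same throttling game_monitor.py applies via
    last_adjustment_message to avoid spamming identical log lines."""
    last = ""
    for msg in messages:
        if msg and msg != last:
            yield msg
            last = msg

print(list(dedup_messages(["adjusted", "adjusted", "", "topmost", "topmost", "adjusted"])))
# ['adjusted', 'topmost', 'adjusted']
```

Note that an identical message is logged again if a different message was emitted in between, which matches the monitor's behaviour when the window drifts repeatedly.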
@@ -150,25 +150,24 @@ def get_system_prompt(
     else:
         # 如果沒有預載入數據,則使用完整記憶協議
         memory_enforcement = f"""
 === CHROMADB MEMORY RETRIEVAL PROTOCOL - Wolfhart Memory Integration
 To personalize your responses to different users, you MUST follow this memory access protocol internally before responding:

 **1. Basic User Retrieval:**
 - Identify the username from `<CURRENT_MESSAGE>`
-# 修正:使用 BOT_MEMORY_COLLECTION 來獲取用戶資料
-- Using the `tool_calls` mechanism, execute: `chroma_query_documents(collection_name: "{config.BOT_MEMORY_COLLECTION}", query_texts: ["{{username}} profile"], n_results: 1)` # 只需最相關的1筆
+- Using the `tool_calls` mechanism, execute: `chroma_get_documents(collection_name: "{config.PROFILES_COLLECTION}", ids: ["{{username}}_profile"])`
 - This step must be completed before any response generation

 **2. Context Expansion:**
 - Perform additional queries as needed, using the `tool_calls` mechanism:
   - Relevant conversations: `chroma_query_documents(collection_name: "{config.CONVERSATIONS_COLLECTION}", query_texts: ["{{username}} {{query keywords}}"], n_results: 5)`
   - Core personality reference: `chroma_query_documents(collection_name: "{config.BOT_MEMORY_COLLECTION}", query_texts: ["Wolfhart {{relevant attitude}}"], n_results: 3)`

 **3. Other situation**
 - You should check related memories when Users mention [capital_position], [capital_administrator_role], [server_hierarchy], [last_war], [winter_war], [excavations], [blueprints], [honor_points], [golden_eggs], or [diamonds], as these represent key game mechanics.

 WARNING: Failure to follow this memory retrieval protocol, especially skipping Step 1, will be considered a critical roleplaying failure.
 """

         # 組合系統提示
         system_prompt = f"""
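Step 1 of the revised protocol resolves a profile by a deterministic document ID (`{username}_profile`) instead of a similarity query. A sketch of the tool-call payload the LLM is asked to emit; the dict shape and the `"user_profiles"` collection value are illustrative assumptions, only the tool name and ID convention come from the prompt above:

```python
def build_profile_lookup(username: str, profiles_collection: str) -> dict:
    """Step 1 of the memory protocol: a direct ID lookup in the profiles
    collection, with no embedding search involved (hypothetical payload shape)."""
    return {
        "tool": "chroma_get_documents",
        "arguments": {
            "collection_name": profiles_collection,
            "ids": [f"{username}_profile"],
        },
    }

call = build_profile_lookup("Alice", "user_profiles")
print(call["arguments"]["ids"])  # ['Alice_profile']
```

An ID lookup is cheaper and exact, which is presumably why the change replaces `chroma_query_documents` with `chroma_get_documents` for this step.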
182  main.py
@@ -16,6 +16,8 @@ from mcp import ClientSession, StdioServerParameters, types
 # --- Keyboard Imports ---
 import threading
 import time
+# Import MessageDeduplication from ui_interaction
+from ui_interaction import MessageDeduplication
 try:
     import keyboard  # Needs pip install keyboard
 except ImportError:
@@ -30,7 +32,6 @@ import llm_interaction
 # Import UI module
 import ui_interaction
 import chroma_client
-# import game_monitor # No longer importing, will run as subprocess
 import subprocess  # Import subprocess module
 import signal
 import platform
@@ -65,9 +66,6 @@ trigger_queue: ThreadSafeQueue = ThreadSafeQueue() # UI Thread -> Main Loop
 command_queue: ThreadSafeQueue = ThreadSafeQueue()  # Main Loop -> UI Thread
 # --- End Change ---
 ui_monitor_task: asyncio.Task | None = None  # To track the UI monitor task
-game_monitor_process: subprocess.Popen | None = None  # To store the game monitor subprocess
-monitor_reader_task: asyncio.Future | None = None  # Store the future from run_in_executor
-stop_reader_event = threading.Event()  # Event to signal the reader thread to stop

 # --- Keyboard Shortcut State ---
 script_paused = False
@@ -107,16 +105,14 @@ def handle_f8():
         except Exception as e:
             print(f"Error sending pause command (F8): {e}")
     else:
-        print("\n--- F8 pressed: Resuming script, resetting state, and resuming UI monitoring ---")
-        reset_command = {'action': 'reset_state'}
+        print("\n--- F8 pressed: Resuming script and UI monitoring ---")
+        resume_command = {'action': 'resume'}
         try:
-            main_loop.call_soon_threadsafe(command_queue.put_nowait, reset_command)
-            # Add a small delay? Let's try without first.
-            # time.sleep(0.05) # Short delay between commands if needed
             main_loop.call_soon_threadsafe(command_queue.put_nowait, resume_command)
         except Exception as e:
-            print(f"Error sending reset/resume commands (F8): {e}")
+            print(f"Error sending resume command (F8): {e}")

 def handle_f9():
     """Handles F9 press: Initiates script shutdown."""
@@ -149,70 +145,6 @@ def keyboard_listener():
 # --- End Keyboard Shortcut Handlers ---

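Both hotkey handlers hand commands from the `keyboard` listener thread to the asyncio loop with `loop.call_soon_threadsafe(queue.put_nowait, command)`, since `put_nowait` must not be called directly from a foreign thread. A minimal self-contained sketch of that hand-off, assuming `ThreadSafeQueue` behaves like `asyncio.Queue`:

```python
import asyncio
import threading

async def main() -> dict:
    loop = asyncio.get_running_loop()
    command_queue: asyncio.Queue = asyncio.Queue()

    def hotkey_thread():
        # Runs outside the event loop: the put must be marshalled onto it.
        loop.call_soon_threadsafe(command_queue.put_nowait, {'action': 'resume'})

    t = threading.Thread(target=hotkey_thread)
    t.start()
    command = await command_queue.get()  # Woken by the thread-safe callback
    t.join()
    return command

print(asyncio.run(main()))  # {'action': 'resume'}
```

Calling `command_queue.put_nowait` directly from `hotkey_thread` would mutate loop state without waking the waiting `get()`, which is exactly the race `call_soon_threadsafe` exists to avoid.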
-# --- Game Monitor Signal Reader (Threaded Blocking Version) ---
-def read_monitor_output(process: subprocess.Popen, queue: ThreadSafeQueue, loop: asyncio.AbstractEventLoop, stop_event: threading.Event):
-    """Runs in a separate thread, reads stdout blocking, parses JSON, and puts commands in the queue."""
-    print("Game monitor output reader thread started.")
-    try:
-        while not stop_event.is_set():
-            if not process.stdout:
-                print("[Monitor Reader Thread] Subprocess stdout is None. Exiting thread.")
-                break
-
-            try:
-                # Blocking read - this is fine in a separate thread
-                line = process.stdout.readline()
-            except ValueError:
-                # Can happen if the pipe is closed during readline
-                print("[Monitor Reader Thread] ValueError on readline (pipe likely closed). Exiting thread.")
-                break
-
-            if not line:
-                # EOF reached (process terminated)
-                print("[Monitor Reader Thread] EOF reached on stdout. Exiting thread.")
-                break
-
-            line = line.strip()
-            if line:
-                # Log raw line immediately
-                print(f"[Monitor Reader Thread] Received raw line: '{line}'")
-                try:
-                    data = json.loads(line)
-                    action = data.get('action')
-                    print(f"[Monitor Reader Thread] Parsed action: '{action}'")  # Log parsed action
-                    if action == 'pause_ui':
-                        command = {'action': 'pause'}
-                        print(f"[Monitor Reader Thread] Preparing to queue command: {command}")  # Log before queueing
-                        loop.call_soon_threadsafe(queue.put_nowait, command)
-                        print("[Monitor Reader Thread] Pause command queued.")  # Log after queueing
-                    elif action == 'resume_ui':
-                        # Removed direct resume_ui handling - ui_interaction will handle pause/resume based on restart_complete
-                        print("[Monitor Reader Thread] Received old 'resume_ui' signal, ignoring.")
-                    elif action == 'restart_complete':
-                        command = {'action': 'handle_restart_complete'}
-                        print(f"[Monitor Reader Thread] Received 'restart_complete' signal, preparing to queue command: {command}")
-                        try:
-                            loop.call_soon_threadsafe(queue.put_nowait, command)
-                            print("[Monitor Reader Thread] 'handle_restart_complete' command queued.")
-                        except Exception as q_err:
-                            print(f"[Monitor Reader Thread] Error putting 'handle_restart_complete' command in queue: {q_err}")
-                    else:
-                        print(f"[Monitor Reader Thread] Received unknown action from monitor: {action}")
-                except json.JSONDecodeError:
-                    print(f"[Monitor Reader Thread] ERROR: Could not decode JSON from monitor: '{line}'")
-                    # Log the raw line that failed to parse
-                    # print(f"[Monitor Reader Thread] Raw line that failed JSON decode: '{line}'") # Already logged raw line earlier
-                except Exception as e:
-                    print(f"[Monitor Reader Thread] Error processing monitor output: {e}")
-            # No sleep needed here as readline() is blocking
-    except Exception as e:
-        # Catch broader errors in the thread loop itself
-        print(f"[Monitor Reader Thread] Thread loop error: {e}")
-    finally:
-        print("Game monitor output reader thread stopped.")
-# --- End Game Monitor Signal Reader ---
-
-
 # --- Chat Logging Function ---
 def log_chat_interaction(user_name: str, user_message: str, bot_name: str, bot_message: str, bot_thoughts: str | None = None):
     """Logs the chat interaction, including optional bot thoughts, to a date-stamped file if enabled."""
@@ -318,7 +250,7 @@ if platform.system() == "Windows" and win32api and win32con:
 # --- Cleanup Function ---
 async def shutdown():
     """Gracefully closes connections and stops monitoring tasks/processes."""
-    global wolfhart_persona_details, ui_monitor_task, shutdown_requested, game_monitor_process, monitor_reader_task # Add monitor_reader_task
+    global wolfhart_persona_details, ui_monitor_task, shutdown_requested
     # Ensure shutdown is requested if called externally (e.g., Ctrl+C)
     if not shutdown_requested:
         print("Shutdown initiated externally (e.g., Ctrl+C).")
@@ -338,42 +270,7 @@ async def shutdown():
     except Exception as e:
         print(f"Error while waiting for UI monitoring task cancellation: {e}")

-    # 1b. Signal and Wait for Monitor Reader Thread
-    if monitor_reader_task:  # Check if the future exists
-        if not stop_reader_event.is_set():
-            print("Signaling monitor output reader thread to stop...")
-            stop_reader_event.set()
-
-        # Wait for the thread to finish (the future returned by run_in_executor)
-        # This might block briefly, but it's necessary to ensure clean thread shutdown
-        # We don't await it directly in the async shutdown, but check if it's done
-        # A better approach might be needed if the thread blocks indefinitely
-        print("Waiting for monitor output reader thread to finish (up to 2s)...")
-        try:
-            # Wait for the future to complete with a timeout
-            await asyncio.wait_for(monitor_reader_task, timeout=2.0)
-            print("Monitor output reader thread finished.")
-        except asyncio.TimeoutError:
-            print("Warning: Monitor output reader thread did not finish within timeout.")
-        except asyncio.CancelledError:
-            print("Monitor output reader future was cancelled.")  # Should not happen if we don't cancel it
-        except Exception as e:
-            print(f"Error waiting for monitor reader thread future: {e}")
-
-    # 2. Terminate Game Monitor Subprocess (after signaling reader thread)
-    if game_monitor_process:
-        print("Terminating game monitor subprocess...")
-        try:
-            game_monitor_process.terminate()
-            # Optionally wait for a short period or check return code
-            # game_monitor_process.wait(timeout=1)
-            print("Game monitor subprocess terminated.")
-        except Exception as e:
-            print(f"Error terminating game monitor subprocess: {e}")
-        finally:
-            game_monitor_process = None  # Clear the reference
-
-    # 3. Close MCP connections via AsyncExitStack
+    # 2. Close MCP connections via AsyncExitStack
     # This will trigger the __aexit__ method of stdio_client contexts,
     # which we assume handles terminating the server subprocesses it started.
     print(f"Closing MCP Server connections (via AsyncExitStack)...")
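Shutdown relies on `AsyncExitStack.aclose()` tearing down every `stdio_client` context that was entered, in reverse order of entry. That behaviour can be demonstrated with stand-in contexts (`FakeServer` is illustrative only, not part of the codebase):

```python
import asyncio
from contextlib import AsyncExitStack

class FakeServer:
    """Stand-in for an stdio_client context; records enter/exit order."""
    def __init__(self, name: str, log: list):
        self.name, self.log = name, log
    async def __aenter__(self):
        self.log.append(f"start {self.name}")
        return self
    async def __aexit__(self, *exc):
        self.log.append(f"stop {self.name}")

async def demo() -> list:
    log = []
    async with AsyncExitStack() as stack:
        for name in ("server-a", "server-b"):
            await stack.enter_async_context(FakeServer(name, log))
    return log  # Stack closed here: contexts exit in LIFO order

print(asyncio.run(demo()))
# ['start server-a', 'start server-b', 'stop server-b', 'stop server-a']
```

This is why no explicit per-server cleanup remains in `shutdown()`: closing the stack is sufficient, provided each context's `__aexit__` really does terminate its child process.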
@@ -555,7 +452,7 @@ def initialize_memory_system():
 # --- Main Async Function ---
 async def run_main_with_exit_stack():
     """Initializes connections, loads persona, starts UI monitor and main processing loop."""
-    global initialization_successful, main_task, loop, wolfhart_persona_details, trigger_queue, ui_monitor_task, shutdown_requested, script_paused, command_queue, game_monitor_process, monitor_reader_task # Add monitor_reader_task to globals
+    global initialization_successful, main_task, loop, wolfhart_persona_details, trigger_queue, ui_monitor_task, shutdown_requested, script_paused, command_queue
     try:
         # 1. Load Persona Synchronously (before async loop starts)
         load_persona_from_file()  # Corrected function
@@ -586,57 +483,38 @@ async def run_main_with_exit_stack():

         # 5. Start UI Monitoring in a separate thread
         print("\n--- Starting UI monitoring thread ---")
-        # Use the new monitoring loop function, passing both queues
+        # 5c. Create MessageDeduplication instance
+        deduplicator = MessageDeduplication(expiry_seconds=3600)  # Default 1 hour
+
+        # Use the new monitoring loop function, passing both queues and the deduplicator
         monitor_task = loop.create_task(
-            asyncio.to_thread(ui_interaction.run_ui_monitoring_loop, trigger_queue, command_queue),  # Pass command_queue
+            asyncio.to_thread(ui_interaction.run_ui_monitoring_loop, trigger_queue, command_queue, deduplicator),  # Pass command_queue and deduplicator
            name="ui_monitor"
         )
         ui_monitor_task = monitor_task  # Store task reference for shutdown
         # Note: UI task cancellation is handled in shutdown()

-        # 5b. Start Game Window Monitoring as a Subprocess
-        # global game_monitor_process, monitor_reader_task # Already declared global at function start
-        print("\n--- Starting Game Window monitoring as a subprocess ---")
-        try:
-            # Use sys.executable to ensure the same Python interpreter is used
-            # Capture stdout to read signals
-            game_monitor_process = subprocess.Popen(
-                [sys.executable, 'game_monitor.py'],
-                stdout=subprocess.PIPE,  # Capture stdout
-                stderr=subprocess.PIPE,  # Capture stderr for logging/debugging
-                text=True,  # Decode stdout/stderr as text (UTF-8 by default)
-                bufsize=1,  # Line buffered
-                # Ensure process creation flags are suitable for Windows if needed
-                # creationflags=subprocess.CREATE_NO_WINDOW # Example: Hide console window
-            )
-            print(f"Game monitor subprocess started (PID: {game_monitor_process.pid}).")
-
-            # Start the thread to read monitor output if process started successfully
-            if game_monitor_process.stdout:
-                # Run the blocking reader function in a separate thread using the default executor
-                monitor_reader_task = loop.run_in_executor(
-                    None,  # Use default ThreadPoolExecutor
-                    read_monitor_output,  # The function to run
-                    game_monitor_process,  # Arguments for the function...
-                    command_queue,
-                    loop,
-                    stop_reader_event  # Pass the stop event
-                )
-                print("Monitor output reader thread submitted to executor.")
-            else:
-                print("Error: Could not access game monitor subprocess stdout.")
-                monitor_reader_task = None
-
-            # Optionally, start a task to read stderr as well for debugging
-            # stderr_reader_task = loop.create_task(read_stderr(game_monitor_process), name="monitor_stderr_reader")
-
-        except FileNotFoundError:
-            print("Error: 'game_monitor.py' not found. Cannot start game monitor subprocess.")
-            game_monitor_process = None
-        except Exception as e:
-            print(f"Error starting game monitor subprocess: {e}")
-            game_monitor_process = None
+        # 5b. Game Window Monitoring is now handled by Setup.py
+
+        # 5d. Start Periodic Cleanup Timer for Deduplicator
+        def periodic_cleanup():
+            if not shutdown_requested:  # Only run if not shutting down
+                print("Main Thread: Running periodic deduplicator cleanup...")
+                deduplicator.purge_expired()
+                # Reschedule the timer
+                cleanup_timer = threading.Timer(600, periodic_cleanup)  # 10 minutes
+                cleanup_timer.daemon = True
+                cleanup_timer.start()
+            else:
+                print("Main Thread: Shutdown requested, not rescheduling deduplicator cleanup.")
+
+        print("\n--- Starting periodic deduplicator cleanup timer (10 min interval) ---")
+        initial_cleanup_timer = threading.Timer(600, periodic_cleanup)
+        initial_cleanup_timer.daemon = True
+        initial_cleanup_timer.start()
+        # Note: This timer will run in a separate thread.
+        # Ensure it's handled correctly on shutdown if it holds resources.
+        # Since it's a daemon thread and reschedules itself, it should exit when the main program exits.

         # 6. Start the main processing loop (non-blocking check on queue)
         print("\n--- Wolfhart chatbot has started (waiting for triggers) ---")
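The periodic cleanup above uses a `threading.Timer` that re-arms itself on every tick and stops re-arming once shutdown is requested. The same pattern in isolation, with a short interval for demonstration (main.py uses 600 seconds and checks `shutdown_requested` instead of an `Event`):

```python
import threading
import time

def start_periodic(interval: float, fn, stop_event: threading.Event):
    """Run fn immediately and then every `interval` seconds via
    self-rescheduling daemon timers, until stop_event is set."""
    def tick():
        if stop_event.is_set():
            return  # Shutdown requested: do not re-arm
        fn()
        timer = threading.Timer(interval, tick)
        timer.daemon = True
        timer.start()
    tick()

ticks = []
stop = threading.Event()
start_periodic(0.01, lambda: ticks.append(time.monotonic()), stop)
time.sleep(0.2)
stop.set()
print(len(ticks) >= 2)  # True
```

Because each timer is a daemon thread and the chain breaks as soon as the stop condition holds, no explicit join is needed at program exit.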
42  memory_backup.py  Normal file
@@ -0,0 +1,42 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Wolf Chat 記憶備份工具

用於手動執行記憶備份或啟動定時調度器
"""

import sys
import argparse
import datetime
from memory_manager import run_memory_backup_manual, MemoryScheduler  # Updated import
import config  # Import config to access default schedule times

def main():
    parser = argparse.ArgumentParser(description='Wolf Chat 記憶備份工具')
    parser.add_argument('--backup', action='store_true', help='執行一次性備份 (預設為昨天,除非指定 --date)')
    parser.add_argument('--date', type=str, help='處理指定日期的日誌 (YYYY-MM-DD格式) for --backup')
    parser.add_argument('--schedule', action='store_true', help='啟動定時調度器')
    parser.add_argument('--hour', type=int, help='備份時間(小時,0-23)for --schedule')
    parser.add_argument('--minute', type=int, help='備份時間(分鐘,0-59)for --schedule')

    args = parser.parse_args()

    if args.backup:
        # The date logic is now handled inside run_memory_backup_manual
        run_memory_backup_manual(args.date)
    elif args.schedule:
        scheduler = MemoryScheduler()
        # Use provided hour/minute or fallback to config defaults
        backup_hour = args.hour if args.hour is not None else getattr(config, 'MEMORY_BACKUP_HOUR', 0)
        backup_minute = args.minute if args.minute is not None else getattr(config, 'MEMORY_BACKUP_MINUTE', 0)

        scheduler.schedule_daily_backup(backup_hour, backup_minute)
        scheduler.start()
    else:
        print("請指定操作: --backup 或 --schedule")
        parser.print_help()
        sys.exit(1)

if __name__ == "__main__":
    main()
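The CLI requires exactly one of the two modes, and the parsing can be exercised without touching the scheduler by passing argv explicitly. A minimal sketch against the argument definitions above (help strings omitted):

```python
import argparse

parser = argparse.ArgumentParser(description='Wolf Chat 記憶備份工具')
parser.add_argument('--backup', action='store_true')
parser.add_argument('--date', type=str)
parser.add_argument('--schedule', action='store_true')
parser.add_argument('--hour', type=int)
parser.add_argument('--minute', type=int)

# Equivalent to: python memory_backup.py --backup --date 2025-05-01
args = parser.parse_args(['--backup', '--date', '2025-05-01'])
print(args.backup, args.schedule, args.date)  # True False 2025-05-01
```

`--hour` and `--minute` default to `None` when omitted, which is what lets `main()` distinguish "not given" from an explicit `0` via `is not None` before falling back to the config defaults.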
783  memory_manager.py  Normal file
@@ -0,0 +1,783 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Wolf Chat 記憶管理模組

處理聊天記錄解析、記憶生成和ChromaDB寫入的一體化模組
"""

import os
import re
import json
import time
import asyncio
import datetime
import schedule
from pathlib import Path
from typing import Dict, List, Optional, Any, Union, Callable
from functools import wraps

# import chromadb # No longer directly needed by ChromaDBManager
# from chromadb.utils import embedding_functions # No longer directly needed by ChromaDBManager
from openai import AsyncOpenAI

import config
import chroma_client  # Import the centralized chroma client

# =============================================================================
# 重試裝飾器
# =============================================================================

def retry_operation(max_attempts: int = 3, delay: float = 1.0):
    """重試裝飾器,用於數據庫操作"""
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs) -> Any:
            attempts = 0
            last_error = None

            while attempts < max_attempts:
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    attempts += 1
                    last_error = e
                    print(f"操作失敗,嘗試次數 {attempts}/{max_attempts}: {e}")

                    if attempts < max_attempts:
                        # 指數退避策略
                        sleep_time = delay * (2 ** (attempts - 1))
                        print(f"等待 {sleep_time:.2f} 秒後重試...")
                        time.sleep(sleep_time)

            print(f"操作失敗達到最大嘗試次數 ({max_attempts}),最後錯誤: {last_error}")
            # 在生產環境中,您可能希望引發最後一個錯誤或返回一個特定的錯誤指示符
            # 根據您的需求,返回 False 可能適合某些情況
            return False  # 或者 raise last_error

        return wrapper
    return decorator
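Typical usage of the decorator on a transiently failing operation: the third attempt succeeds, while a function that never succeeds returns `False` after exhausting its attempts. This sketch re-implements the decorator with the progress prints removed and the delay shortened for the demo:

```python
import time
from functools import wraps

def retry_operation(max_attempts: int = 3, delay: float = 0.01):
    # Same structure as the decorator above, minus the progress prints.
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            attempts = 0
            while attempts < max_attempts:
                try:
                    return func(*args, **kwargs)
                except Exception:
                    attempts += 1
                    if attempts < max_attempts:
                        time.sleep(delay * (2 ** (attempts - 1)))  # Exponential backoff
            return False  # All attempts exhausted
        return wrapper
    return decorator

calls = []

@retry_operation(max_attempts=3)
def flaky_write():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient DB error")
    return "ok"

print(flaky_write(), len(calls))  # ok 3
```

One caveat of the `return False` fallback, noted in the source comments: callers cannot distinguish a legitimate falsy result from exhausted retries, so re-raising `last_error` is the safer choice for operations whose return value matters.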
# =============================================================================
# 日誌解析部分
# =============================================================================

def parse_log_file(log_path: str) -> List[Dict[str, str]]:
    """解析日誌文件,提取對話內容"""
    conversations = []

    with open(log_path, 'r', encoding='utf-8') as f:
        content = f.read()

    # 使用分隔符分割對話
    dialogue_blocks = content.split('---')

    for block in dialogue_blocks:
        if not block.strip():
            continue

        # 解析對話塊
        timestamp_pattern = r'\[([\d-]+ [\d:]+)\]'
        user_pattern = r'User \(([^)]+)\): (.+?)(?=\[|$)'
        bot_thoughts_pattern = r'Bot \(([^)]+)\) Thoughts: (.+?)(?=\[|$)'
        bot_dialogue_pattern = r'Bot \(([^)]+)\) Dialogue: (.+?)(?=\[|$)'

        # 提取時間戳記
        timestamp_match = re.search(timestamp_pattern, block)
        user_match = re.search(user_pattern, block, re.DOTALL)
        bot_thoughts_match = re.search(bot_thoughts_pattern, block, re.DOTALL)
        bot_dialogue_match = re.search(bot_dialogue_pattern, block, re.DOTALL)

        if timestamp_match and user_match and bot_dialogue_match:
            timestamp = timestamp_match.group(1)
            user_name = user_match.group(1)
            user_message = user_match.group(2).strip()
            bot_name = bot_dialogue_match.group(1)
            bot_message = bot_dialogue_match.group(2).strip()
            bot_thoughts = bot_thoughts_match.group(2).strip() if bot_thoughts_match else ""

            # 創建對話記錄
            conversation = {
                "timestamp": timestamp,
                "user_name": user_name,
                "user_message": user_message,
                "bot_name": bot_name,
                "bot_message": bot_message,
                "bot_thoughts": bot_thoughts
            }

            conversations.append(conversation)

    return conversations

def get_logs_for_date(date: datetime.date, log_dir: str = "chat_logs") -> List[Dict[str, str]]:
    """獲取指定日期的所有日誌文件"""
    date_str = date.strftime("%Y-%m-%d")
    log_path = os.path.join(log_dir, f"{date_str}.log")

    if os.path.exists(log_path):
        return parse_log_file(log_path)
    return []

def group_conversations_by_user(conversations: List[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
    """按用戶分組對話"""
    user_conversations = {}

    for conv in conversations:
        user_name = conv["user_name"]
        if user_name not in user_conversations:
            user_conversations[user_name] = []
        user_conversations[user_name].append(conv)

    return user_conversations
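The parser's regexes imply a specific log block shape. A minimal reproduction of one block (the sample text is inferred from the patterns, not copied from a real log) shows how each field is extracted:

```python
import re

# One dialogue block in the format parse_log_file expects.
block = """[2025-05-02 12:00:00] User (Alice): Hello there
[2025-05-02 12:00:05] Bot (Wolfhart) Thoughts: She seems curious.
[2025-05-02 12:00:06] Bot (Wolfhart) Dialogue: Good day."""

# Same patterns as parse_log_file; re.DOTALL lets messages span lines,
# and the (?=\[|$) lookahead stops each lazy match at the next timestamp.
timestamp = re.search(r'\[([\d-]+ [\d:]+)\]', block).group(1)
user = re.search(r'User \(([^)]+)\): (.+?)(?=\[|$)', block, re.DOTALL)
dialogue = re.search(r'Bot \(([^)]+)\) Dialogue: (.+?)(?=\[|$)', block, re.DOTALL)

print(timestamp)                  # 2025-05-02 12:00:00
print(user.group(1))              # Alice
print(dialogue.group(2).strip())  # Good day.
```

Because `$` without `re.MULTILINE` only matches at the end of the block, the last message in a block is terminated by end-of-string rather than a following `[`.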
# =============================================================================
# 記憶生成器部分
# =============================================================================

class MemoryGenerator:
    def __init__(self, profile_model: Optional[str] = None, summary_model: Optional[str] = None):
        self.profile_client = AsyncOpenAI(
            api_key=config.OPENAI_API_KEY,
            base_url=config.OPENAI_API_BASE_URL if config.OPENAI_API_BASE_URL else None,
        )
        self.summary_client = AsyncOpenAI(
            api_key=config.OPENAI_API_KEY,
            base_url=config.OPENAI_API_BASE_URL if config.OPENAI_API_BASE_URL else None,
        )
        self.profile_model = profile_model or getattr(config, 'MEMORY_PROFILE_MODEL', config.LLM_MODEL)
        self.summary_model = summary_model or getattr(config, 'MEMORY_SUMMARY_MODEL', "mistral-7b-instruct")
        self.persona_data = self._load_persona_data()

    def _load_persona_data(self, persona_file: str = "persona.json") -> Dict[str, Any]:
        """Load persona data from JSON file."""
        try:
            with open(persona_file, 'r', encoding='utf-8') as f:
                return json.load(f)
        except FileNotFoundError:
            print(f"Warning: Persona file '{persona_file}' not found. Proceeding without persona data.")
            return {}
        except json.JSONDecodeError:
            print(f"Warning: Error decoding JSON from '{persona_file}'. Proceeding without persona data.")
            return {}

    async def generate_user_profile(
        self,
        user_name: str,
        conversations: List[Dict[str, str]],
        existing_profile: Optional[Dict[str, Any]] = None
    ) -> Optional[Dict[str, Any]]:
        """Generate or update user profile based on conversations"""
        system_prompt = self._get_profile_system_prompt(config.PERSONA_NAME, existing_profile)

        # Prepare user conversation records
        conversation_text = self._format_conversations_for_prompt(conversations)

        user_prompt = f"""
Please generate a complete profile for user '{user_name}':

Conversation history:
{conversation_text}

Please analyze this user based on the conversation history and your personality, and generate or update a profile in JSON format, including:
1. User's personality traits
2. Relationship with you ({config.PERSONA_NAME})
3. Your subjective perception of the user
4. Important interaction records
5. Any other information you think is important

Please ensure the output is valid JSON format, using the following format:
```json
{{
    "id": "{user_name}_profile",
    "type": "user_profile",
    "username": "{user_name}",
    "content": {{
        "personality": "User personality traits...",
        "relationship_with_bot": "Description of relationship with me...",
        "bot_perception": "My subjective perception of the user...",
        "notable_interactions": ["Important interaction 1", "Important interaction 2"]
    }},
    "last_updated": "YYYY-MM-DD",
    "metadata": {{
        "priority": 1.0,
        "word_count": 0
    }}
}}
```

When evaluating, please pay special attention to my "thoughts" section, as that reflects my true thoughts about the user.
"""

        try:
            response = await self.profile_client.chat.completions.create(
                model=self.profile_model,
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                temperature=0.7
            )

            # Parse JSON response
            profile_text = response.choices[0].message.content
            # Extract JSON part
            json_match = re.search(r'```json\s*(.*?)\s*```', profile_text, re.DOTALL)
            if json_match:
                profile_json_str = json_match.group(1)
            else:
                # Try parsing directly
                profile_json_str = profile_text

            profile_json = json.loads(profile_json_str)

            # After parsing the initial JSON response
            content_str = json.dumps(profile_json["content"], ensure_ascii=False)
            if len(content_str) > 5000:
                # Too long - request a more concise version
                condensed_prompt = f"Your profile is {len(content_str)} characters. Create a new version under 5000 characters. Keep the same structure but be extremely concise."

                condensed_response = await self.profile_client.chat.completions.create(
                    model=self.profile_model,
                    messages=[
                        {"role": "system", "content": system_prompt},
                        {"role": "user", "content": user_prompt},
                        {"role": "assistant", "content": profile_json_str},
                        {"role": "user", "content": condensed_prompt}
                    ],
                    temperature=0.5
                )

                # Extract the condensed JSON
                condensed_text = condensed_response.choices[0].message.content
                # Parse JSON and update profile_json
                json_match = re.search(r'```json\s*(.*?)\s*```', condensed_text, re.DOTALL)
                if json_match:
                    profile_json_str = json_match.group(1)
                else:
                    profile_json_str = condensed_text
                profile_json = json.loads(profile_json_str)
                content_str = json.dumps(profile_json["content"], ensure_ascii=False)  # Recalculate content_str

            profile_json["metadata"]["word_count"] = len(content_str)
            profile_json["last_updated"] = datetime.datetime.now().strftime("%Y-%m-%d")

            return profile_json

        except Exception as e:
            print(f"Error generating user profile: {e}")
            return None
    async def generate_conversation_summary(
        self,
        user_name: str,
        conversations: List[Dict[str, str]]
    ) -> Optional[Dict[str, Any]]:
        """Generate conversation summary for user"""
        system_prompt = f"""
You are {config.PERSONA_NAME}, an intelligent conversational AI.
Your task is to summarize the conversations between you and the user, preserving key information and emotional changes.
The summary should be concise yet informative, not exceeding 250 words.
"""

        # Prepare user conversation records
        conversation_text = self._format_conversations_for_prompt(conversations)

        # Generate current date
        today = datetime.datetime.now().strftime("%Y-%m-%d")

        user_prompt = f"""
Please summarize my conversation with user '{user_name}' on {today}:

{conversation_text}

Please output in JSON format, as follows:
```json
{{
    "id": "{user_name}_summary_{today.replace('-', '')}",
    "type": "dialogue_summary",
    "date": "{today}",
    "username": "{user_name}",
    "content": "Conversation summary content...",
    "key_points": ["Key point 1", "Key point 2"],
    "metadata": {{
        "priority": 0.7,
        "word_count": 0
    }}
}}
```

The summary should reflect my perspective and views on the conversation, not a neutral third-party perspective.
"""

        try:
            response = await self.summary_client.chat.completions.create(
                model=self.summary_model,
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                temperature=0.5
            )

            # Parse JSON response
            summary_text = response.choices[0].message.content
            # Extract JSON part
            json_match = re.search(r'```json\s*(.*?)\s*```', summary_text, re.DOTALL)
            if json_match:
                summary_json_str = json_match.group(1)
            else:
                # Try parsing directly
                summary_json_str = summary_text

            summary_json = json.loads(summary_json_str)

            # Add or update word count
            summary_json["metadata"]["word_count"] = len(summary_json["content"])

            return summary_json

        except Exception as e:
            print(f"Error generating conversation summary: {e}")
            return None

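Both generation methods recover JSON from the model reply the same way: look for a fenced json code block first, and fall back to parsing the raw reply. A minimal, standalone sketch of that extraction step (the helper name is illustrative, not part of the module; `` `{3} `` in the pattern is just three literal backticks):

```python
import json
import re

# `{3} matches three literal backticks, i.e. a Markdown code fence.
JSON_FENCE = re.compile(r'`{3}json\s*(.*?)\s*`{3}', re.DOTALL)

def extract_json_block(text: str):
    """Parse the JSON payload of an LLM reply.

    Prefer a fenced json code block; fall back to parsing the whole
    reply, mirroring the fallback used by the generation methods above.
    """
    match = JSON_FENCE.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)

fence = chr(96) * 3  # three backticks
reply = f"Here you go:\n{fence}json\n{{\"id\": \"alice_profile\"}}\n{fence}"
print(extract_json_block(reply)["id"])      # alice_profile
print(extract_json_block('{"n": 1}')["n"])  # 1
```

Note that `json.loads` still raises on malformed output, which is why the callers above wrap the whole flow in try/except and return None on failure.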
    def _get_profile_system_prompt(self, bot_name: str, existing_profile: Optional[Dict[str, Any]] = None) -> str:
        """Get system prompt for generating user profile"""
        persona_details = ""
        if self.persona_data:
            # Construct a string from persona_data, focusing on key aspects.
            # We can be selective here or dump the whole thing if the model can handle it.
            # For now, let's include a significant portion.
            persona_info_to_include = {
                "name": self.persona_data.get("name"),
                "personality": self.persona_data.get("personality"),
                "language_social": self.persona_data.get("language_social"),
                "values_interests_goals": self.persona_data.get("values_interests_goals"),
                "preferences_reactions": self.persona_data.get("preferences_reactions")
            }
            persona_details = f"""
Your detailed persona profile is as follows:
```json
{json.dumps(persona_info_to_include, ensure_ascii=False, indent=2)}
```
Please embody this persona when analyzing the user and generating their profile.
"""

        system_prompt = f"""
You are {bot_name}, an AI assistant with deep analytical capabilities.
{persona_details}
Your task is to analyze the user's interactions with you and create their user profile.

CRITICAL: The ENTIRE profile content must be under 5000 characters total. Be extremely concise.

The profile should:
1. Be completely based on your character's perspective
2. Focus only on key personality traits and core relationship dynamics
3. Include only the most significant interactions

The output should be valid JSON format, following the provided template.
"""

        if existing_profile:
            system_prompt += f"""
You already have an existing user profile, please update based on this:
```json
{json.dumps(existing_profile, ensure_ascii=False, indent=2)}
```

Please retain valid information, integrate new observations, and resolve any contradictions or outdated information.
"""

        return system_prompt

    def _format_conversations_for_prompt(self, conversations: List[Dict[str, str]]) -> str:
        """Format conversation records for prompt"""
        conversation_text = ""

        for i, conv in enumerate(conversations):
            conversation_text += f"Conversation {i+1}:\n"
            conversation_text += f"Time: {conv['timestamp']}\n"
            conversation_text += f"User ({conv['user_name']}): {conv['user_message']}\n"
            if conv.get('bot_thoughts'):  # Check if bot_thoughts exists
                conversation_text += f"My thoughts: {conv['bot_thoughts']}\n"
            conversation_text += f"My response: {conv['bot_message']}\n\n"

        return conversation_text

# =============================================================================
# ChromaDB operations
# =============================================================================


class ChromaDBManager:
    def __init__(self, collection_name: Optional[str] = None):
        self.collection_name = collection_name or config.BOT_MEMORY_COLLECTION
        self._db_collection = None  # Cache for the collection object

    def _get_db_collection(self):
        """Helper to get the collection object from chroma_client"""
        if self._db_collection is None:
            # Use the centralized get_collection function
            self._db_collection = chroma_client.get_collection(self.collection_name)
            if self._db_collection is None:
                # This indicates a failure in chroma_client to provide the collection
                raise RuntimeError(f"Failed to get or create collection '{self.collection_name}' via chroma_client. Check chroma_client logs.")
        return self._db_collection

    @retry_operation(max_attempts=3, delay=1.0)
    def upsert_user_profile(self, profile_data: Dict[str, Any]) -> bool:
        """Insert or update a user profile"""
        collection = self._get_db_collection()
        if not profile_data or not isinstance(profile_data, dict):
            print("Invalid profile data")
            return False

        try:
            user_id = profile_data.get("id")
            if not user_id:
                print("Profile is missing an 'id' field")
                return False

            # Prepare the metadata.
            # Note: ChromaDB's upsert handles the existence check implicitly.
            # The .get call here isn't strictly necessary for the upsert operation itself,
            # but might be kept if there was other logic depending on prior existence.
            # For a clean upsert it can be removed; we assume it is not critical for now.
            # results = collection.get(ids=[user_id], limit=1)  # Optional: if needed for pre-check logic

            metadata = {
                "id": user_id,
                "type": "user_profile",
                "username": profile_data.get("username", ""),
                "priority": 1.0  # High priority
            }

            # Add the remaining metadata fields
            if "metadata" in profile_data and isinstance(profile_data["metadata"], dict):
                for k, v in profile_data["metadata"].items():
                    if k not in ["id", "type", "username", "priority"]:  # Avoid overwriting key fields
                        # Handle non-primitive values
                        if isinstance(v, (list, dict, tuple)):
                            # Convert to a string
                            metadata[k] = json.dumps(v, ensure_ascii=False)
                        else:
                            metadata[k] = v

            # Serialize the content
            content_doc = json.dumps(profile_data.get("content", {}), ensure_ascii=False)

            # Insert or update
            collection.upsert(
                ids=[user_id],
                documents=[content_doc],
                metadatas=[metadata]
            )
            print(f"Upserted user profile: {user_id} into collection {self.collection_name}")

            return True

        except Exception as e:
            print(f"Error upserting user profile: {e}")
            return False

    @retry_operation(max_attempts=3, delay=1.0)
    def upsert_conversation_summary(self, summary_data: Dict[str, Any]) -> bool:
        """Insert or update a conversation summary"""
        collection = self._get_db_collection()
        if not summary_data or not isinstance(summary_data, dict):
            print("Invalid summary data")
            return False

        try:
            summary_id = summary_data.get("id")
            if not summary_id:
                print("Summary is missing an 'id' field")
                return False

            # Prepare the metadata
            metadata = {
                "id": summary_id,
                "type": "dialogue_summary",
                "username": summary_data.get("username", ""),
                "date": summary_data.get("date", ""),
                "priority": 0.7  # Lower priority
            }

            # Add the remaining metadata fields
            if "metadata" in summary_data and isinstance(summary_data["metadata"], dict):
                for k, v in summary_data["metadata"].items():
                    if k not in ["id", "type", "username", "date", "priority"]:
                        # Handle non-primitive values
                        if isinstance(v, (list, dict, tuple)):
                            # Convert to a string
                            metadata[k] = json.dumps(v, ensure_ascii=False)
                        else:
                            metadata[k] = v

            # Build the document content
            content_doc = summary_data.get("content", "")
            if "key_points" in summary_data and summary_data["key_points"]:
                key_points_str = "\n".join([f"- {point}" for point in summary_data["key_points"]])
                content_doc += f"\n\nKey points:\n{key_points_str}"

            # Write the data
            collection.upsert(
                ids=[summary_id],
                documents=[content_doc],
                metadatas=[metadata]
            )
            print(f"Upserted conversation summary: {summary_id} into collection {self.collection_name}")

            return True

        except Exception as e:
            print(f"Error upserting conversation summary: {e}")
            return False

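Both upsert methods flatten nested metadata values because ChromaDB metadata fields only accept primitive types; lists, dicts, and tuples are JSON-encoded instead. That flattening step in isolation (helper name and reserved-key set are illustrative, following the same rule as the code above):

```python
import json

# Keys the upsert methods above refuse to overwrite.
RESERVED_KEYS = {"id", "type", "username", "date", "priority"}

def flatten_metadata(extra: dict, base: dict) -> dict:
    """Merge extra metadata into base, JSON-encoding non-primitive values."""
    merged = dict(base)
    for k, v in extra.items():
        if k in RESERVED_KEYS:
            continue  # never overwrite the reserved fields
        if isinstance(v, (list, dict, tuple)):
            merged[k] = json.dumps(v, ensure_ascii=False)
        else:
            merged[k] = v
    return merged

meta = flatten_metadata({"tags": ["vip", "eu"], "visits": 3},
                        {"id": "alice_profile", "priority": 1.0})
print(meta["tags"])    # the JSON string '["vip", "eu"]'
print(meta["visits"])  # 3
```

Reading the values back therefore requires a `json.loads` on any field that was stored this way, which is something consumers of these records need to remember.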
    def get_existing_profile(self, username: str) -> Optional[Dict[str, Any]]:
        """Fetch an existing user profile"""
        collection = self._get_db_collection()
        try:
            profile_id = f"{username}_profile"
            results = collection.get(
                ids=[profile_id],
                limit=1
            )

            if results and results["ids"] and results["documents"]:
                idx = 0
                # Ensure the document is not None before trying to load it
                doc_content = results["documents"][idx]
                if doc_content is None:
                    print(f"Warning: Document for profile {profile_id} is None.")
                    return None

                profile_data = {
                    "id": profile_id,
                    "type": "user_profile",
                    "username": username,
                    "content": json.loads(doc_content),
                    "last_updated": "",  # Will be populated from metadata if it exists
                    "metadata": {}
                }

                # Read the metadata
                if results["metadatas"] and results["metadatas"][idx]:
                    metadata_db = results["metadatas"][idx]
                    for k, v in metadata_db.items():
                        if k == "last_updated":
                            profile_data["last_updated"] = str(v)  # Ensure it's a string
                        elif k not in ["id", "type", "username"]:
                            profile_data["metadata"][k] = v

                return profile_data

            return None

        except json.JSONDecodeError as je:
            print(f"Error decoding JSON for profile {username}: {je}")
            return None
        except Exception as e:
            print(f"Error fetching user profile for {username}: {e}")
            return None

# =============================================================================
# Memory manager
# =============================================================================


class MemoryManager:
    def __init__(self):
        self.memory_generator = MemoryGenerator(
            profile_model=getattr(config, 'MEMORY_PROFILE_MODEL', config.LLM_MODEL),
            summary_model=getattr(config, 'MEMORY_SUMMARY_MODEL', "mistral-7b-instruct")
        )
        self.db_manager = ChromaDBManager(collection_name=config.BOT_MEMORY_COLLECTION)
        # Ensure LOG_DIR is correctly referenced from config
        self.log_dir = getattr(config, 'LOG_DIR', "chat_logs")

    async def process_daily_logs(self, date: Optional[datetime.date] = None) -> None:
        """Process the logs for the given date (defaults to yesterday)"""
        # If no date is given, use yesterday
        if date is None:
            date = datetime.datetime.now().date() - datetime.timedelta(days=1)

        date_str = date.strftime("%Y-%m-%d")
        log_path = os.path.join(self.log_dir, f"{date_str}.log")

        if not os.path.exists(log_path):
            print(f"Log file not found: {log_path}")
            return

        print(f"Processing log file: {log_path}")

        # Parse the log
        conversations = parse_log_file(log_path)
        if not conversations:
            print(f"Log file {log_path} is empty or contained no parsable conversations.")
            return
        print(f"Parsed {len(conversations)} conversation records")

        # Group by user
        user_conversations = group_conversations_by_user(conversations)
        print(f"{len(user_conversations)} users have conversations")

        # Generate/update a profile and a conversation summary for each user
        failed_users = []
        for username, convs in user_conversations.items():
            print(f"Processing {len(convs)} conversations for user '{username}'")

            try:
                # Fetch the existing profile
                existing_profile = self.db_manager.get_existing_profile(username)

                # Generate or update the user profile
                profile_data = await self.memory_generator.generate_user_profile(
                    username, convs, existing_profile
                )

                if profile_data:
                    profile_success = self.db_manager.upsert_user_profile(profile_data)
                    if not profile_success:
                        print(f"Warning: could not save the profile for user '{username}'")

                # Generate the conversation summary
                summary_data = await self.memory_generator.generate_conversation_summary(
                    username, convs
                )

                if summary_data:
                    summary_success = self.db_manager.upsert_conversation_summary(summary_data)
                    if not summary_success:
                        print(f"Warning: could not save the conversation summary for user '{username}'")

            except Exception as e:
                print(f"Error while processing user '{username}': {e}")
                failed_users.append(username)
                continue  # Move on to the next user

        if failed_users:
            print(f"The following users failed to process: {', '.join(failed_users)}")
        print(f"Finished processing log: {log_path}")

# =============================================================================
# Scheduler
# =============================================================================


class MemoryScheduler:
    def __init__(self):
        self.memory_manager = MemoryManager()
        self.scheduled = False  # To track if a job is already scheduled

    def schedule_daily_backup(self, hour: Optional[int] = None, minute: Optional[int] = None) -> None:
        """Set the daily backup time"""
        # Clear any existing jobs to prevent duplicates if called multiple times
        schedule.clear()

        backup_hour = hour if hour is not None else getattr(config, 'MEMORY_BACKUP_HOUR', 0)
        backup_minute = minute if minute is not None else getattr(config, 'MEMORY_BACKUP_MINUTE', 0)

        time_str = f"{backup_hour:02d}:{backup_minute:02d}"

        # Register the scheduled job
        schedule.every().day.at(time_str).do(self._run_daily_backup_job)
        self.scheduled = True
        print(f"Daily backup scheduled at: {time_str}")

    def _run_daily_backup_job(self) -> None:
        """Helper to run the async job for the scheduler."""
        print(f"Starting daily memory backup - {datetime.datetime.now()}")
        try:
            # Create a new event loop for the thread if not running in the main thread
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop)
            loop.run_until_complete(self.memory_manager.process_daily_logs())
            loop.close()
            print(f"Daily memory backup finished - {datetime.datetime.now()}")
        except Exception as e:
            print(f"Error running daily backup: {e}")
        # schedule.every().day.at(...).do() expects the job function to return
        # schedule.CancelJob if it should not be rescheduled; otherwise it is rescheduled.
        # For a daily job we want it to reschedule, so we don't return CancelJob.

    def start(self) -> None:
        """Start the scheduler"""
        if not self.scheduled:
            self.schedule_daily_backup()  # Schedule with default/config times if not already

        print("Scheduler started; press Ctrl+C to stop")
        try:
            while True:
                schedule.run_pending()
                time.sleep(1)  # Check every second
        except KeyboardInterrupt:
            print("Scheduler stopped")
        except Exception as e:
            print(f"Scheduler runtime error: {e}")
        finally:
            print("Scheduler shutting down...")


# =============================================================================
# Direct-run entry point
# =============================================================================


def run_memory_backup_manual(date_str: Optional[str] = None) -> None:
    """Run a memory backup manually for a specific date string, or for yesterday."""
    target_date = None
    if date_str:
        try:
            target_date = datetime.datetime.strptime(date_str, "%Y-%m-%d").date()
        except ValueError:
            print(f"Invalid date format: {date_str}. Falling back to yesterday's date.")
            target_date = datetime.datetime.now().date() - datetime.timedelta(days=1)
    else:
        target_date = datetime.datetime.now().date() - datetime.timedelta(days=1)
        print(f"No date specified; processing yesterday's log: {target_date.strftime('%Y-%m-%d')}")

    memory_manager = MemoryManager()

    # Set up the asyncio event loop for the manual run
    loop = asyncio.get_event_loop()
    if loop.is_closed():
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

    try:
        loop.run_until_complete(memory_manager.process_daily_logs(target_date))
    except Exception as e:
        print(f"Error during manual memory backup: {e}")
    finally:
        # If we created a new loop, we might want to close it.
        # However, if get_event_loop() returned an existing running loop,
        # we should not close it here.
        # For simplicity in a script this is acceptable, but be careful in complex apps.
        # loop.close()  # Be cautious with this line.
        pass
    print("Memory backup finished")


# When this script is run directly
if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description='Wolf Chat memory management module')
    parser.add_argument('--backup', action='store_true', help='Run a one-off backup (defaults to yesterday unless --date is given)')
    parser.add_argument('--date', type=str, help='Process the log for a specific date (YYYY-MM-DD format) with --backup')
    parser.add_argument('--schedule', action='store_true', help='Start the daily scheduler')
    parser.add_argument('--hour', type=int, help='Backup hour (0-23) for --schedule')
    parser.add_argument('--minute', type=int, help='Backup minute (0-59) for --schedule')

    args = parser.parse_args()

    if args.backup:
        run_memory_backup_manual(args.date)
    elif args.schedule:
        scheduler = MemoryScheduler()
        # Pass hour/minute only if provided; otherwise the defaults in schedule_daily_backup apply
        scheduler.schedule_daily_backup(
            hour=args.hour if args.hour is not None else getattr(config, 'MEMORY_BACKUP_HOUR', 0),
            minute=args.minute if args.minute is not None else getattr(config, 'MEMORY_BACKUP_MINUTE', 0)
        )
        scheduler.start()
    else:
        print("Please specify an action: --backup or --schedule")
        parser.print_help()
529 reembed_chroma_data.py (new file)
@@ -0,0 +1,529 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Reembedding Tool

This script re-computes the vectors for the data in existing ChromaDB
collections using a new embedding model and stores the result.
"""

import os
import sys
import json
import time
import argparse
import shutil
from datetime import datetime
from typing import List, Dict, Any, Optional, Tuple
from tqdm import tqdm  # Progress bar

try:
    import chromadb
    from chromadb.utils import embedding_functions
except ImportError:
    print("Error: please install chromadb first: pip install chromadb")
    sys.exit(1)

try:
    from sentence_transformers import SentenceTransformer
except ImportError:
    print("Error: please install sentence-transformers first: pip install sentence-transformers")
    sys.exit(1)

# Try to import the configuration
try:
    import config
except ImportError:
    print("Warning: could not import config.py; falling back to defaults")
    # Build a minimal configuration
    class MinimalConfig:
        CHROMA_DATA_DIR = "chroma_data"
        BOT_MEMORY_COLLECTION = "wolfhart_memory"
        CONVERSATIONS_COLLECTION = "wolfhart_memory"
        PROFILES_COLLECTION = "wolfhart_memory"
    config = MinimalConfig()

def parse_args():
    """Parse the command-line arguments"""
    parser = argparse.ArgumentParser(description='ChromaDB data reembedding tool')

    parser.add_argument('--new-model', type=str,
                        default="sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
                        help='Name of the new embedding model (default: sentence-transformers/paraphrase-multilingual-mpnet-base-v2)')

    parser.add_argument('--collections', type=str, nargs='+',
                        help='Space-separated list of collection names to process (default: all collections from the configuration)')

    parser.add_argument('--backup', action='store_true',
                        help='Back up the database before processing (recommended)')

    parser.add_argument('--batch-size', type=int, default=100,
                        help='Batch size (default: 100)')

    parser.add_argument('--temp-collection-suffix', type=str, default="_temp_new",
                        help='Suffix for the temporary collections (default: _temp_new)')

    parser.add_argument('--dry-run', action='store_true',
                        help='Simulate the run without modifying any data')

    parser.add_argument('--confirm-dangerous', action='store_true',
                        help='Confirm dangerous operations (e.g. deleting collections)')

    return parser.parse_args()

def backup_chroma_directory(chroma_dir: str) -> str:
    """Back up the ChromaDB data directory

    Args:
        chroma_dir: Path to the ChromaDB data directory

    Returns:
        Path of the backup directory
    """
    if not os.path.exists(chroma_dir):
        print(f"Error: ChromaDB directory '{chroma_dir}' does not exist")
        sys.exit(1)

    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_dir = f"{chroma_dir}_backup_{timestamp}"

    print(f"Backing up database from '{chroma_dir}' to '{backup_dir}'...")
    shutil.copytree(chroma_dir, backup_dir)
    print(f"Backup complete: {backup_dir}")

    return backup_dir

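The backup step derives a sibling directory name from a second-resolution timestamp, so repeated runs never collide with each other or with the live directory. The naming scheme in isolation (the helper name and example path are illustrative):

```python
from datetime import datetime

def backup_dir_name(chroma_dir: str, now: datetime) -> str:
    """Mirror the '<dir>_backup_<YYYYmmdd_HHMMSS>' naming used above."""
    return f"{chroma_dir}_backup_{now.strftime('%Y%m%d_%H%M%S')}"

name = backup_dir_name("chroma_data", datetime(2025, 5, 2, 13, 45, 7))
print(name)  # chroma_data_backup_20250502_134507
```

Because the backup lands next to the original directory, restoring is a plain directory rename; no database-level import is needed.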
def create_embedding_function(model_name: str):
    """Create the embedding function

    Args:
        model_name: Name of the embedding model

    Returns:
        An embedding function object
    """
    if not model_name:
        print("Using the ChromaDB default embedding model")
        return embedding_functions.DefaultEmbeddingFunction()

    print(f"Loading embedding model: {model_name}")
    try:
        # Use SentenceTransformerEmbeddingFunction directly
        from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction
        embedding_function = SentenceTransformerEmbeddingFunction(model_name=model_name)
        # Warm up the model
        _ = embedding_function(["."])
        return embedding_function
    except Exception as e:
        print(f"Error: could not load model '{model_name}': {e}")
        print("Falling back to the default embedding model")
        return embedding_functions.DefaultEmbeddingFunction()

def get_collection_names(client, default_collections: List[str]) -> List[str]:
    """Get the names of all available collections

    Args:
        client: ChromaDB client
        default_collections: Default list of collections

    Returns:
        List of available collection names
    """
    try:
        all_collections = client.list_collections()
        collection_names = [col.name for col in all_collections]

        if collection_names:
            return collection_names
        else:
            print("Warning: no collections found; using the defaults")
            return default_collections

    except Exception as e:
        print(f"Failed to list collections: {e}")
        print("Using the default collections")
        return default_collections

def fetch_collection_data(client, collection_name: str, batch_size: int = 100) -> Dict[str, Any]:
    """Extract all data from a collection

    Args:
        client: ChromaDB client
        collection_name: Collection name
        batch_size: Batch size

    Returns:
        Dict of collection data, containing ids, documents and metadatas
    """
    try:
        collection = client.get_collection(name=collection_name)

        # Total number of items in the collection
        count_result = collection.count()
        if count_result == 0:
            print(f"Collection '{collection_name}' is empty")
            return {"ids": [], "documents": [], "metadatas": []}

        print(f"Reading {count_result} items from collection '{collection_name}'...")

        # Fetch the data in batches
        all_ids = []
        all_documents = []
        all_metadatas = []

        offset = 0
        with tqdm(total=count_result, desc=f"Reading {collection_name}") as pbar:
            while True:
                # Note: use the include parameter to fetch only the data we need
                batch_result = collection.get(
                    limit=batch_size,
                    offset=offset,
                    include=["documents", "metadatas"]
                )

                batch_ids = batch_result.get("ids", [])
                if not batch_ids:
                    break

                all_ids.extend(batch_ids)
                all_documents.extend(batch_result.get("documents", []))
                all_metadatas.extend(batch_result.get("metadatas", []))

                offset += len(batch_ids)
                pbar.update(len(batch_ids))

                if len(batch_ids) < batch_size:
                    break

        return {
            "ids": all_ids,
            "documents": all_documents,
            "metadatas": all_metadatas
        }

    except Exception as e:
        print(f"Error fetching data from collection '{collection_name}': {e}")
        return {"ids": [], "documents": [], "metadatas": []}

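fetch_collection_data pages through the collection with limit/offset until an empty or short batch signals the end. The loop shape, shown against a fake in-memory paged source (the `fetch_all` helper and fake source are only for illustration, not part of the script):

```python
def fetch_all(get_page, batch_size=100):
    """Drain a limit/offset paged source, as fetch_collection_data does."""
    all_ids = []
    offset = 0
    while True:
        batch = get_page(limit=batch_size, offset=offset)
        if not batch:
            break  # empty batch: nothing left at this offset
        all_ids.extend(batch)
        offset += len(batch)
        if len(batch) < batch_size:
            break  # short batch: this was the last page
    return all_ids

data = [f"id_{i}" for i in range(250)]
fake_get = lambda limit, offset: data[offset:offset + limit]
print(len(fetch_all(fake_get, batch_size=100)))  # 250
```

The short-batch check saves one extra round trip: without it, a source holding an exact multiple of `batch_size` items would need a final empty request to terminate.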
def create_and_populate_collection(
    client,
    collection_name: str,
    data: Dict[str, Any],
    embedding_func,
    batch_size: int = 100,
    dry_run: bool = False
) -> bool:
    """Create a new collection and populate it with data

    Args:
        client: ChromaDB client
        collection_name: Collection name
        data: Data to add (ids, documents, metadatas)
        embedding_func: Embedding function
        batch_size: Batch size
        dry_run: Whether to only simulate the run

    Returns:
        True on success, False otherwise
    """
    if dry_run:
        print(f"[dry run] Would create collection '{collection_name}' and add {len(data['ids'])} items")
        return True

    try:
        # Check whether the collection already exists
        if collection_name in [col.name for col in client.list_collections()]:
            client.delete_collection(collection_name)

        # Create the new collection
        collection = client.create_collection(
            name=collection_name,
            embedding_function=embedding_func
        )

        # Nothing to add: return early
        if not data["ids"]:
            print(f"Collection '{collection_name}' created, but no data was added")
            return True

        # Add the data in batches
        total_items = len(data["ids"])
        with tqdm(total=total_items, desc=f"Populating {collection_name}") as pbar:
            for i in range(0, total_items, batch_size):
                end_idx = min(i + batch_size, total_items)

                batch_ids = data["ids"][i:end_idx]
                batch_docs = data["documents"][i:end_idx]
                batch_meta = data["metadatas"][i:end_idx]

                # Handle possible None values
                processed_docs = []
                for doc in batch_docs:
                    if doc is None:
                        processed_docs.append("")  # Substitute an empty string for None
                    else:
                        processed_docs.append(doc)

                collection.add(
                    ids=batch_ids,
                    documents=processed_docs,
                    metadatas=batch_meta
                )

                pbar.update(end_idx - i)

        print(f"Successfully added {total_items} items to collection '{collection_name}'")
        return True

    except Exception as e:
        print(f"Error creating or populating collection '{collection_name}': {e}")
        import traceback
        traceback.print_exc()
        return False

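Before re-adding documents, the function above substitutes empty strings for None documents, presumably so that every entry handed to the collection is a plain string the embedding model can process. That guard in isolation (the helper name is illustrative):

```python
def sanitize_documents(docs):
    """Replace None documents with empty strings, as done above."""
    return ["" if doc is None else doc for doc in docs]

print(sanitize_documents(["hello", None, "world"]))  # ['hello', '', 'world']
```

The substitution keeps the ids, documents, and metadatas lists aligned by index, which batched `add` calls depend on.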
def swap_collections(
    client,
    original_collection: str,
    temp_collection: str,
    confirm_dangerous: bool = False,
    dry_run: bool = False,
    embedding_func = None  # Pass the embedding function as a parameter
) -> bool:
    """Swap the collections: delete the original collection and recreate it
    under the original name from the temporary collection's data.

    Args:
        client: ChromaDB client
        original_collection: Name of the original collection
        temp_collection: Name of the temporary collection
        confirm_dangerous: Whether dangerous operations are pre-confirmed
        dry_run: Whether to only simulate the run
        embedding_func: Embedding function used to create the new collection

    Returns:
        True on success, False otherwise
    """
    if dry_run:
        print(f"[dry run] Would swap collections: delete '{original_collection}', rename '{temp_collection}' to '{original_collection}'")
        return True

    try:
        # Ask for confirmation unless it was given on the command line
        if not confirm_dangerous:
            response = input(f"Warning: about to delete collection '{original_collection}' and replace it with '{temp_collection}'. Proceed? (y/N): ")
            if response.lower() != 'y':
                print("Operation cancelled")
                return False

        # Check that both collections exist
        all_collections = [col.name for col in client.list_collections()]
        if original_collection not in all_collections:
            print(f"Error: original collection '{original_collection}' does not exist")
            return False

        if temp_collection not in all_collections:
            print(f"Error: temporary collection '{temp_collection}' does not exist")
            return False

        # Fetch all data from the temporary collection
        # before deleting the original collection
        print(f"Fetching data from temporary collection '{temp_collection}'...")
        temp_collection_obj = client.get_collection(temp_collection)
        temp_data = temp_collection_obj.get(include=["documents", "metadatas"])

        # Delete the original collection
        print(f"Deleting original collection '{original_collection}'...")
        client.delete_collection(original_collection)

        # Create a new collection under the original name
        print(f"Creating new collection '{original_collection}'...")

        # Use the supplied embedding function, or the temporary collection's one
        embedding_function = embedding_func or temp_collection_obj._embedding_function

        # Create the new collection
        original_collection_obj = client.create_collection(
            name=original_collection,
            embedding_function=embedding_function
        )

        # Copy the data into the new collection
        if temp_data["ids"]:
            print(f"Copying {len(temp_data['ids'])} items from the temporary collection to the new one...")

            # Handle possible None values
            processed_docs = []
            for doc in temp_data["documents"]:
                if doc is None:
                    processed_docs.append("")
                else:
                    processed_docs.append(doc)

            # Add the data in batches to avoid potential large-payload issues
            batch_size = 100
            for i in range(0, len(temp_data["ids"]), batch_size):
                end = min(i + batch_size, len(temp_data["ids"]))
                original_collection_obj.add(
                    ids=temp_data["ids"][i:end],
                    documents=processed_docs[i:end],
                    metadatas=temp_data["metadatas"][i:end] if temp_data["metadatas"] else None
                )

        # Delete the temporary collection
        print(f"Deleting temporary collection '{temp_collection}'...")
        client.delete_collection(temp_collection)

        print(f"Successfully replaced collection '{original_collection}' with the re-embedded data")
        return True

    except Exception as e:
        print(f"Error swapping collections: {e}")
        import traceback
        traceback.print_exc()
        return False

def process_collection(
    client,
    collection_name: str,
    embedding_func,
    temp_suffix: str,
    batch_size: int,
    confirm_dangerous: bool,
    dry_run: bool
) -> bool:
    """Run the full pipeline for one collection

    Args:
        client: ChromaDB client
        collection_name: Name of the collection to process
        embedding_func: New embedding function
        temp_suffix: Suffix for the temporary collection
        batch_size: Batch size
        confirm_dangerous: Whether dangerous operations are pre-confirmed
        dry_run: Whether to only simulate the run

    Returns:
        True if processing succeeded, False otherwise
    """
    print(f"\n{'=' * 60}")
    print(f"Processing collection: '{collection_name}'")
    print(f"{'=' * 60}")

    # Temporary collection name
    temp_collection_name = f"{collection_name}{temp_suffix}"

    # 1. Fetch the data from the original collection
    data = fetch_collection_data(client, collection_name, batch_size)

    if not data["ids"]:
        print(f"Collection '{collection_name}' is empty or missing; skipping")
        return True

    # 2. Create the temporary collection and populate it with the new embedding model
    success = create_and_populate_collection(
        client,
        temp_collection_name,
        data,
        embedding_func,
        batch_size,
        dry_run
    )

    if not success:
        print(f"Failed to create temporary collection '{temp_collection_name}'; skipping the swap")
        return False

    # 3. Replace the original collection
    success = swap_collections(
        client,
        collection_name,
        temp_collection_name,
        confirm_dangerous,
        dry_run,
        embedding_func  # Pass the embedding function along
    )

    return success

def main():
|
||||
"""主函數"""
|
||||
args = parse_args()
|
||||
|
||||
# 獲取ChromaDB目錄
|
||||
chroma_dir = getattr(config, "CHROMA_DATA_DIR", "chroma_data")
|
||||
print(f"使用ChromaDB目錄: {chroma_dir}")
|
||||
|
||||
# 備份數據庫(如果請求)
|
||||
if args.backup:
|
||||
backup_chroma_directory(chroma_dir)
|
||||
|
||||
# 創建ChromaDB客戶端
|
||||
try:
|
||||
client = chromadb.PersistentClient(path=chroma_dir)
|
||||
except Exception as e:
|
||||
print(f"錯誤: 無法連接到ChromaDB: {e}")
|
||||
sys.exit(1)
|
||||
|
||||
# 創建嵌入函數
|
||||
embedding_func = create_embedding_function(args.new_model)
|
||||
|
||||
# 確定要處理的集合
|
||||
if args.collections:
|
||||
collections_to_process = args.collections
|
||||
else:
|
||||
# 使用配置中的默認集合或獲取所有可用集合
|
||||
default_collections = [
|
||||
getattr(config, "BOT_MEMORY_COLLECTION", "wolfhart_memory"),
|
||||
getattr(config, "CONVERSATIONS_COLLECTION", "conversations"),
|
||||
getattr(config, "PROFILES_COLLECTION", "user_profiles")
|
||||
]
|
||||
collections_to_process = get_collection_names(client, default_collections)
|
||||
|
||||
# 過濾掉已經是臨時集合的集合名稱
|
||||
filtered_collections = []
|
||||
for collection in collections_to_process:
|
||||
if args.temp_collection_suffix in collection:
|
||||
print(f"警告: 跳過可能的臨時集合 '{collection}'")
|
||||
continue
|
||||
filtered_collections.append(collection)
|
||||
|
||||
collections_to_process = filtered_collections
|
||||
|
||||
if not collections_to_process:
|
||||
print("沒有找到可處理的集合。")
|
||||
sys.exit(0)
|
||||
|
||||
print(f"將處理以下集合: {', '.join(collections_to_process)}")
|
||||
if args.dry_run:
|
||||
print("注意: 執行為乾運行模式,不會實際修改數據")
|
||||
|
||||
# 詢問用戶確認
|
||||
if not args.confirm_dangerous and not args.dry_run:
|
||||
confirm = input("這個操作將使用新的嵌入模型重新計算所有數據。繼續? (y/N): ")
|
||||
if confirm.lower() != 'y':
|
||||
print("操作已取消")
|
||||
sys.exit(0)
|
||||
|
||||
# 處理每個集合
|
||||
start_time = time.time()
|
||||
success_count = 0
|
||||
|
||||
for collection_name in collections_to_process:
|
||||
if process_collection(
|
||||
client,
|
||||
collection_name,
|
||||
embedding_func,
|
||||
args.temp_collection_suffix,
|
||||
args.batch_size,
|
||||
args.confirm_dangerous,
|
||||
args.dry_run
|
||||
):
|
||||
success_count += 1
|
||||
|
||||
# 報告結果
|
||||
elapsed_time = time.time() - start_time
|
||||
print(f"\n{'=' * 60}")
|
||||
print(f"處理完成: {success_count}/{len(collections_to_process)} 個集合成功")
|
||||
print(f"總耗時: {elapsed_time:.2f} 秒")
|
||||
print(f"{'=' * 60}")
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
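The script above writes data back in fixed-size batches so that no single `add()` call carries the whole collection. That batching pattern can be sketched in isolation; the `FakeCollection` below is a hypothetical stand-in for a real ChromaDB collection (only the `add()` signature is assumed to match), so the sketch runs without ChromaDB installed:

```python
def batched_add(collection, ids, documents, metadatas=None, batch_size=100):
    """Split parallel lists into fixed-size chunks so each add() stays small."""
    for i in range(0, len(ids), batch_size):
        end = min(i + batch_size, len(ids))
        collection.add(
            ids=ids[i:end],
            documents=documents[i:end],
            metadatas=metadatas[i:end] if metadatas else None,
        )

class FakeCollection:
    """Hypothetical stand-in for a ChromaDB collection; records batch sizes."""
    def __init__(self):
        self.calls = []
    def add(self, ids, documents, metadatas=None):
        self.calls.append(len(ids))

col = FakeCollection()
batched_add(col, [str(n) for n in range(250)], ["doc"] * 250)
print(col.calls)  # [100, 100, 50]
```

250 items with `batch_size=100` yield two full batches and one remainder batch, matching the `min(i + batch_size, len(...))` slicing in the script.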
155
simple_bubble_dedup.py
Normal file
@ -0,0 +1,155 @@
import os
import json
import collections
import threading
from PIL import Image
import imagehash
import numpy as np
import io

class SimpleBubbleDeduplication:
    def __init__(self, storage_file="simple_bubble_dedup.json", max_bubbles=5, threshold=5, hash_size=16):
        self.storage_file = storage_file
        self.max_bubbles = max_bubbles  # Keep only the most recent max_bubbles bubbles
        self.threshold = threshold  # Hash difference threshold (lower values are more strict)
        self.hash_size = hash_size  # Hash size
        self.lock = threading.Lock()

        # Use OrderedDict to maintain order
        self.recent_bubbles = collections.OrderedDict()
        # Load stored bubble hashes
        self._load_storage()

    def _load_storage(self):
        """Load processed bubble hash values from file"""
        if os.path.exists(self.storage_file):
            try:
                with open(self.storage_file, 'r') as f:
                    data = json.load(f)

                # Convert stored data to OrderedDict and load
                self.recent_bubbles.clear()
                # Use loaded_count to track loaded items, ensuring we don't exceed max_bubbles
                loaded_count = 0
                for bubble_id, bubble_data in data.items():
                    if loaded_count >= self.max_bubbles:
                        break
                    self.recent_bubbles[bubble_id] = {
                        'hash': imagehash.hex_to_hash(bubble_data['hash']),
                        'sender': bubble_data.get('sender', 'Unknown')
                    }
                    loaded_count += 1

                print(f"Loaded {len(self.recent_bubbles)} bubble hash records")
            except Exception as e:
                print(f"Failed to load bubble hash records: {e}")
                self.recent_bubbles.clear()

    def _save_storage(self):
        """Save bubble hashes to file"""
        try:
            # Create temporary dictionary for saving
            data_to_save = {}
            for bubble_id, bubble_data in self.recent_bubbles.items():
                data_to_save[bubble_id] = {
                    'hash': str(bubble_data['hash']),
                    'sender': bubble_data.get('sender', 'Unknown')
                }

            with open(self.storage_file, 'w') as f:
                json.dump(data_to_save, f, indent=2)
            print(f"Saved {len(data_to_save)} bubble hash records")
        except Exception as e:
            print(f"Failed to save bubble hash records: {e}")

    def compute_image_hash(self, bubble_snapshot):
        """Calculate perceptual hash of bubble image"""
        try:
            # If bubble_snapshot is a PIL.Image object
            if isinstance(bubble_snapshot, Image.Image):
                img = bubble_snapshot
            # If bubble_snapshot is a PyAutoGUI screenshot
            elif hasattr(bubble_snapshot, 'save'):
                img = bubble_snapshot
            # If it's bytes or BytesIO
            elif isinstance(bubble_snapshot, (bytes, io.BytesIO)):
                img = Image.open(io.BytesIO(bubble_snapshot) if isinstance(bubble_snapshot, bytes) else bubble_snapshot)
            # If it's a numpy array
            elif isinstance(bubble_snapshot, np.ndarray):
                img = Image.fromarray(bubble_snapshot)
            else:
                print(f"Unrecognized image format: {type(bubble_snapshot)}")
                return None

            # Calculate perceptual hash
            phash = imagehash.phash(img, hash_size=self.hash_size)
            return phash
        except Exception as e:
            print(f"Failed to calculate image hash: {e}")
            return None

    def generate_bubble_id(self, bubble_region):
        """Generate ID based on bubble region"""
        return f"bubble_{bubble_region[0]}_{bubble_region[1]}_{bubble_region[2]}_{bubble_region[3]}"

    def is_duplicate(self, bubble_snapshot, bubble_region, sender_name=""):
        """Check if bubble is a duplicate"""
        with self.lock:
            if bubble_snapshot is None:
                return False

            # Calculate hash of current bubble
            current_hash = self.compute_image_hash(bubble_snapshot)
            if current_hash is None:
                print("Unable to calculate bubble hash, cannot perform deduplication")
                return False

            # Generate ID for current bubble
            bubble_id = self.generate_bubble_id(bubble_region)

            # Check if similar to any known bubbles
            for stored_id, bubble_data in self.recent_bubbles.items():
                stored_hash = bubble_data['hash']
                hash_diff = current_hash - stored_hash

                if hash_diff <= self.threshold:
                    print(f"Detected duplicate bubble (ID: {stored_id}, Hash difference: {hash_diff})")
                    if sender_name:
                        print(f"Sender: {sender_name}, Recorded sender: {bubble_data.get('sender', 'Unknown')}")
                    return True

            # Not a duplicate, add to recent bubbles list
            self.recent_bubbles[bubble_id] = {
                'hash': current_hash,
                'sender': sender_name
            }

            # If exceeding maximum count, remove oldest item
            while len(self.recent_bubbles) > self.max_bubbles:
                self.recent_bubbles.popitem(last=False)  # Remove first item (oldest)

            self._save_storage()
            return False

    def clear_all(self):
        """Clear all records"""
        with self.lock:
            count = len(self.recent_bubbles)
            self.recent_bubbles.clear()
            self._save_storage()
            print(f"Cleared all {count} bubble records")
            return count

    def save_debug_image(self, bubble_snapshot, bubble_id, hash_value):
        """Save debug image (optional feature)"""
        try:
            debug_dir = "bubble_debug"
            if not os.path.exists(debug_dir):
                os.makedirs(debug_dir)

            # Save original image
            img_path = os.path.join(debug_dir, f"{bubble_id}_{hash_value}.png")
            bubble_snapshot.save(img_path)
            print(f"Saved debug image: {img_path}")
        except Exception as e:
            print(f"Failed to save debug image: {e}")
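The core idea behind `is_duplicate()` is that a perceptual hash maps an image to a bit string, and the distance between two hashes approximates how visually different the images are. `imagehash.phash` uses a DCT; the self-contained sketch below substitutes a toy average hash over a 4x4 grayscale grid (an assumption for illustration, not the library's algorithm) so the thresholding behaviour can be seen without PIL or imagehash installed:

```python
def average_hash(pixels):
    """Toy perceptual hash: 1 bit per pixel, set when above the mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(h1, h2):
    """Number of differing bits between two hashes (imagehash's `-` operator)."""
    return sum(a != b for a, b in zip(h1, h2))

# Hypothetical 4x4 grayscale "bubble snapshots"
bubble_a = [[10, 10, 200, 200]] * 4
bubble_b = [[12, 9, 198, 201], [11, 10, 199, 200],
            [10, 12, 200, 199], [9, 10, 201, 200]]  # near-duplicate of a
bubble_c = [[200, 200, 10, 10]] * 4                 # visually different

threshold = 2
print(hamming(average_hash(bubble_a), average_hash(bubble_b)) <= threshold)  # True
print(hamming(average_hash(bubble_a), average_hash(bubble_c)) <= threshold)  # False
```

Small pixel noise barely perturbs the above/below-mean pattern, so near-duplicates land within the threshold while a genuinely different image does not; this is exactly why the class compares `current_hash - stored_hash` against `self.threshold`.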
BIN
templates/chat_option.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 3.3 KiB
BIN
templates/update_confirm.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 5.0 KiB
@@ -412,30 +412,46 @@ class ChromaDBBackup:
             shutil.rmtree(temp_dir)
             return False
 
-    def schedule_backup(self, interval: str, description: str = "", keep_count: int = 0) -> bool:
+    def schedule_backup(self, interval: str, description: str = "", keep_count: int = 0, at_time: Optional[str] = None) -> bool:
         """Schedule periodic backups
 
-        interval: backup interval - daily, weekly, hourly, or a custom cron expression
+        interval: backup interval - daily, weekly, hourly
         description: backup description
         keep_count: number of backups to keep, 0 means unlimited
+        at_time: time of day to run, in "HH:MM" format (e.g. "14:30"); only applies to daily, weekly, monthly
         """
         job_id = f"scheduled_{interval}_{int(time.time())}"
 
+        # Validate the at_time format
+        if at_time:
+            try:
+                time.strptime(at_time, "%H:%M")
+            except ValueError:
+                self.logger.error(f"Invalid time format: {at_time}. Please use HH:MM format.")
+                return False
+
+        # Hourly backups ignore at_time
+        if interval == "hourly":
+            at_time = None
+
         try:
             # Set up the schedule according to the interval
             if interval == "hourly":
-                schedule.every().hour.do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval)
+                schedule.every().hour.do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval, at_time=at_time)
             elif interval == "daily":
-                schedule.every().day.at("00:00").do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval)
+                schedule_time = at_time if at_time else "00:00"
+                schedule.every().day.at(schedule_time).do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval, at_time=at_time)
             elif interval == "weekly":
-                schedule.every().monday.at("00:00").do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval)
+                schedule_time = at_time if at_time else "00:00"
+                schedule.every().monday.at(schedule_time).do(self._run_scheduled_backup, job_id=job_id, description=description, interval=interval, at_time=at_time)
             elif interval == "monthly":
+                schedule_time = at_time if at_time else "00:00"
                 # Run on the 1st of each month
-                schedule.every().day.at("00:00").do(self._check_monthly_schedule, job_id=job_id, description=description, interval=interval)
+                schedule.every().day.at(schedule_time).do(self._check_monthly_schedule, job_id=job_id, description=description, interval=interval, at_time=at_time)
             else:
-                # Custom interval - use the string directly as a cron expression
                 self.logger.warning(f"Unsupported schedule interval: {interval}, falling back to a daily schedule")
-                schedule.every().day.at("00:00").do(self._run_scheduled_backup, job_id=job_id, description=description, interval="daily")
+                schedule_time = at_time if at_time else "00:00"
+                schedule.every().day.at(schedule_time).do(self._run_scheduled_backup, job_id=job_id, description=description, interval="daily", at_time=at_time)
 
             # Store the scheduled job's information
             self.scheduled_jobs[job_id] = {
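The `at_time` validation above relies on `time.strptime` raising `ValueError` for out-of-range or malformed values. That check can be exercised on its own; note that `%H:%M` also accepts non-zero-padded values like `"9:5"`, which is harmless here because the UI's spinboxes always produce two-digit values:

```python
import time

def valid_at_time(at_time):
    """Mirror the time.strptime check used by schedule_backup()."""
    try:
        time.strptime(at_time, "%H:%M")
        return True
    except ValueError:
        return False

print(valid_at_time("14:30"))  # True
print(valid_at_time("25:00"))  # False (hour out of range)
print(valid_at_time("oops"))   # False (not a time at all)
```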
@@ -443,10 +459,11 @@ class ChromaDBBackup:
                 "description": description,
                 "created": datetime.datetime.now(),
                 "keep_count": keep_count,
-                "next_run": self._get_next_run_time(interval)
+                "at_time": at_time,  # new
+                "next_run": self._get_next_run_time(interval, at_time)
             }
 
-            self.logger.info(f"Scheduled {interval} backup, job ID: {job_id}")
+            self.logger.info(f"Scheduled {interval} backup (time: {at_time if at_time else 'default'}), job ID: {job_id}")
             return True
 
         except Exception as e:
@@ -459,32 +476,66 @@ class ChromaDBBackup:
             return self._run_scheduled_backup(job_id, description, interval)
         return None
 
-    def _get_next_run_time(self, interval):
+    def _get_next_run_time(self, interval: str, at_time: Optional[str] = None) -> datetime.datetime:
         """Get the next run time"""
         now = datetime.datetime.now()
 
+        target_hour, target_minute = 0, 0
+        if at_time:
+            try:
+                t = time.strptime(at_time, "%H:%M")
+                target_hour, target_minute = t.tm_hour, t.tm_min
+            except ValueError:
+                # If the format is wrong, fall back to the default time
+                pass
+
         if interval == "hourly":
-            return now.replace(minute=0, second=0) + datetime.timedelta(hours=1)
+            # Hourly jobs ignore at_time and run at the next full hour
+            next_run_time = now.replace(minute=0, second=0, microsecond=0) + datetime.timedelta(hours=1)
+            # If the computed time has already passed, add another hour
+            if next_run_time <= now:
+                next_run_time += datetime.timedelta(hours=1)
+            return next_run_time
 
         elif interval == "daily":
-            return now.replace(hour=0, minute=0, second=0) + datetime.timedelta(days=1)
+            next_run_time = now.replace(hour=target_hour, minute=target_minute, second=0, microsecond=0)
+            if next_run_time <= now:  # if today's time has already passed, schedule for tomorrow
+                next_run_time += datetime.timedelta(days=1)
+            return next_run_time
 
         elif interval == "weekly":
             # Compute the next Monday
-            days_ahead = 0 - now.weekday()
-            if days_ahead <= 0:
+            next_run_time = now.replace(hour=target_hour, minute=target_minute, second=0, microsecond=0)
+            days_ahead = 0 - next_run_time.weekday()  # 0 is Monday
+            if days_ahead <= 0:  # Target day already happened this week
                 days_ahead += 7
-            return now.replace(hour=0, minute=0, second=0) + datetime.timedelta(days=days_ahead)
+            next_run_time += datetime.timedelta(days=days_ahead)
+            # If the computed time has passed (e.g. it is Monday but the time is past), schedule for the Monday after next
+            if next_run_time <= now:
+                next_run_time += datetime.timedelta(weeks=1)
+            return next_run_time
 
         elif interval == "monthly":
             # Compute the 1st of next month
+            next_run_time = now.replace(day=1, hour=target_hour, minute=target_minute, second=0, microsecond=0)
             if now.month == 12:
-                next_month = now.replace(year=now.year+1, month=1, day=1, hour=0, minute=0, second=0)
+                next_run_time = next_run_time.replace(year=now.year + 1, month=1)
             else:
-                next_month = now.replace(month=now.month+1, day=1, hour=0, minute=0, second=0)
-            return next_month
+                next_run_time = next_run_time.replace(month=now.month + 1)
 
+            # If the computed time has passed (e.g. it is the 1st but the time is past), schedule for the 1st of the month after next
+            if next_run_time <= now:
+                if next_run_time.month == 12:
+                    next_run_time = next_run_time.replace(year=next_run_time.year + 1, month=1)
+                else:
+                    next_run_time = next_run_time.replace(month=next_run_time.month + 1)
+            return next_run_time
 
         # Default to tomorrow
-        return now.replace(hour=0, minute=0, second=0) + datetime.timedelta(days=1)
+        default_next_run = now.replace(hour=target_hour, minute=target_minute, second=0, microsecond=0) + datetime.timedelta(days=1)
+        return default_next_run
 
-    def _run_scheduled_backup(self, job_id, description, interval):
+    def _run_scheduled_backup(self, job_id: str, description: str, interval: str, at_time: Optional[str] = None):
         """Run a scheduled backup job"""
        job_info = self.scheduled_jobs.get(job_id)
        if not job_info:
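The "daily" branch of `_get_next_run_time` reduces to: build a candidate datetime at the target HH:MM today, and roll it forward one day if that moment has already passed. A standalone sketch of that logic:

```python
import datetime

def next_daily_run(now, target_hour, target_minute):
    # Mirrors the "daily" branch: today at HH:MM, or tomorrow if already past.
    candidate = now.replace(hour=target_hour, minute=target_minute,
                            second=0, microsecond=0)
    if candidate <= now:
        candidate += datetime.timedelta(days=1)
    return candidate

now = datetime.datetime(2025, 5, 2, 15, 0)
print(next_daily_run(now, 14, 30))  # 2025-05-03 14:30:00 (today's 14:30 has passed)
print(next_daily_run(now, 23, 0))   # 2025-05-02 23:00:00
```

The weekly and monthly branches apply the same "roll forward if `candidate <= now`" rule, just with week- and month-sized steps.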
@@ -493,7 +544,7 @@ class ChromaDBBackup:
 
         try:
             # Update the next run time
-            self.scheduled_jobs[job_id]["next_run"] = self._get_next_run_time(interval)
+            self.scheduled_jobs[job_id]["next_run"] = self._get_next_run_time(interval, at_time)
 
             # Run the backup
             timestamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
@@ -693,7 +744,8 @@ class ChromaDBBackup:
                 "description": job_data["description"],
                 "created": job_data["created"].strftime("%Y-%m-%d %H:%M:%S"),
                 "next_run": job_data["next_run"].strftime("%Y-%m-%d %H:%M:%S") if job_data["next_run"] else "unknown",
-                "keep_count": job_data["keep_count"]
+                "keep_count": job_data["keep_count"],
+                "at_time": job_data.get("at_time", "N/A")  # new
             }
             jobs_info.append(job_info)
 
@@ -967,12 +1019,14 @@ class ChromaDBBackupUI:
         jobs_frame = ttk.Frame(schedule_frame)
         jobs_frame.pack(fill=BOTH, expand=YES)
 
-        columns = ("interval", "next_run")
+        columns = ("interval", "next_run", "at_time")  # add at_time
         self.jobs_tree = ttk.Treeview(jobs_frame, columns=columns, show="headings", height=5)
         self.jobs_tree.heading("interval", text="Interval")
         self.jobs_tree.heading("next_run", text="Next run")
+        self.jobs_tree.heading("at_time", text="Run time")  # new
         self.jobs_tree.column("interval", width=100)
         self.jobs_tree.column("next_run", width=150)
+        self.jobs_tree.column("at_time", width=80)  # new
 
         scrollbar = ttk.Scrollbar(jobs_frame, orient=VERTICAL, command=self.jobs_tree.yview)
         self.jobs_tree.configure(yscrollcommand=scrollbar.set)
@@ -1164,7 +1218,8 @@ class ChromaDBBackupUI:
                 iid=job["id"],  # use the job ID as the tree item ID
                 values=(
                     f"{job['interval']} ({job['description']})",
-                    job["next_run"]
+                    job["next_run"],
+                    job.get("at_time", "N/A")  # new
                 )
             )
 
@@ -1730,7 +1785,7 @@ class ChromaDBBackupUI:
         # Create the dialog
         dialog = tk.Toplevel(self.root)
         dialog.title("Schedule Backup")
-        dialog.geometry("450x450")  # extra height so all elements are visible
+        dialog.geometry("450x550")  # extra height to fit the time picker
         dialog.resizable(False, False)
         dialog.grab_set()
 
@@ -1747,17 +1802,17 @@ class ChromaDBBackupUI:
 
         # Interval selection
         interval_frame = ttk.Frame(main_frame)
-        interval_frame.pack(fill=X, pady=(0, 15))
+        interval_frame.pack(fill=X, pady=(0, 10))  # reduced pady
 
         ttk.Label(interval_frame, text="Backup interval:").pack(anchor=W)
 
         interval_var = tk.StringVar(value="daily")
 
         intervals = [
-            ("Hourly", "hourly"),
+            ("Hourly (ignores the time setting)", "hourly"),  # note that hourly ignores the time
             ("Daily", "daily"),
-            ("Weekly", "weekly"),
-            ("Monthly", "monthly")
+            ("Weekly (Monday)", "weekly"),  # note that weekly defaults to Monday
+            ("Monthly (1st)", "monthly")  # note that monthly defaults to the 1st
         ]
 
         for text, value in intervals:
@@ -1766,17 +1821,50 @@ class ChromaDBBackupUI:
                 text=text,
                 variable=interval_var,
                 value=value
-            ).pack(anchor=W, padx=(20, 0), pady=2)
+            ).pack(anchor=W, padx=(20, 0), pady=1)  # reduced pady
+
+        # Time selection (hour and minute)
+        time_frame = ttk.Frame(main_frame)
+        time_frame.pack(fill=X, pady=(5, 10))  # reduced pady
+
+        ttk.Label(time_frame, text="Run time (HH:MM):").pack(side=LEFT, anchor=W)
+
+        hour_var = tk.StringVar(value="00")
+        minute_var = tk.StringVar(value="00")
+
+        # Hour Spinbox
+        ttk.Spinbox(
+            time_frame,
+            from_=0,
+            to=23,
+            textvariable=hour_var,
+            width=3,
+            format="%02.0f"  # format as two digits
+        ).pack(side=LEFT, padx=(5, 0))
+
+        ttk.Label(time_frame, text=":").pack(side=LEFT, padx=2)
+
+        # Minute Spinbox
+        ttk.Spinbox(
+            time_frame,
+            from_=0,
+            to=59,
+            textvariable=minute_var,
+            width=3,
+            format="%02.0f"  # format as two digits
+        ).pack(side=LEFT, padx=(0, 5))
+
+        ttk.Label(time_frame, text="(Hourly schedules ignore this setting)").pack(side=LEFT, padx=(5, 0), anchor=W)
 
         # Description
         ttk.Label(main_frame, text="Backup description:").pack(anchor=W, pady=(0, 5))
 
         description_var = tk.StringVar(value="Scheduled backup")
-        ttk.Entry(main_frame, textvariable=description_var, width=40).pack(fill=X, pady=(0, 15))
+        ttk.Entry(main_frame, textvariable=description_var, width=40).pack(fill=X, pady=(0, 10))  # reduced pady
 
         # Keep count
         keep_frame = ttk.Frame(main_frame)
-        keep_frame.pack(fill=X, pady=(0, 15))
+        keep_frame.pack(fill=X, pady=(0, 10))  # reduced pady
 
         ttk.Label(keep_frame, text="Maximum number of backups to keep:").pack(side=LEFT)
 
@@ -1795,13 +1883,12 @@ class ChromaDBBackupUI:
         ).pack(side=LEFT, padx=(5, 0))
 
         # Separator
-        ttk.Separator(main_frame, orient=HORIZONTAL).pack(fill=X, pady=15)
+        ttk.Separator(main_frame, orient=HORIZONTAL).pack(fill=X, pady=10)  # reduced pady
 
-        # Bottom button area - standard buttons, ensuring visibility
+        # Bottom button area
         btn_frame = ttk.Frame(main_frame)
-        btn_frame.pack(fill=X, pady=(10, 5))
+        btn_frame.pack(fill=X, pady=(5, 0))  # reduced pady
 
         # Cancel button - standard style
         cancel_btn = ttk.Button(
             btn_frame,
             text="Cancel",
@@ -1810,7 +1897,6 @@ class ChromaDBBackupUI:
         )
         cancel_btn.pack(side=LEFT, padx=(0, 10))
 
-        # Confirm button - standard style, avoiding possible issues with custom styles
         create_btn = ttk.Button(
             btn_frame,
             text="Add to Schedule",
@@ -1819,14 +1905,14 @@ class ChromaDBBackupUI:
                 interval_var.get(),
                 description_var.get(),
                 keep_count_var.get(),
+                f"{hour_var.get()}:{minute_var.get()}",  # combine into a time string
                 dialog
             )
         )
         create_btn.pack(side=LEFT)
 
         # Extra hint so the user knows how to complete the operation
         note_frame = ttk.Frame(main_frame)
-        note_frame.pack(fill=X, pady=(15, 0))
+        note_frame.pack(fill=X, pady=(10, 0))  # reduced pady
 
         ttk.Label(
             note_frame,
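The dialog builds the `at_time` string as `f"{hour_var.get()}:{minute_var.get()}"`. Because both spinboxes use `format="%02.0f"`, the values are already zero-padded; the same padding can be reproduced without Tk, which is useful to see why the combined string always satisfies the `%H:%M` check downstream (a sketch; `to_hhmm` is a hypothetical helper, not part of the codebase):

```python
def to_hhmm(hour, minute):
    """Combine hour and minute into a zero-padded 'HH:MM' string."""
    return f"{int(hour):02d}:{int(minute):02d}"

print(to_hhmm(9, 5))        # 09:05
print(to_hhmm("14", "30"))  # 14:30
```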
@@ -1834,7 +1920,7 @@ class ChromaDBBackupUI:
             foreground="blue"
         ).pack()
 
-    def create_schedule(self, interval, description, keep_count_str, dialog):
+    def create_schedule(self, interval, description, keep_count_str, at_time_str, dialog):
         """Create a backup schedule"""
         dialog.destroy()
 
@@ -1843,15 +1929,26 @@ class ChromaDBBackupUI:
         except ValueError:
             keep_count = 0
 
-        success = self.backup.schedule_backup(interval, description, keep_count)
+        # Validate the time format
+        try:
+            time.strptime(at_time_str, "%H:%M")
+        except ValueError:
+            messagebox.showerror("Error", f"Invalid time format: {at_time_str}. Please use HH:MM format.")
+            self.status_var.set("Failed to create schedule: invalid time format")
+            return
+
+        # For hourly schedules, set at_time to None
+        effective_at_time = at_time_str if interval != "hourly" else None
+
+        success = self.backup.schedule_backup(interval, description, keep_count, effective_at_time)
 
         if success:
-            self.status_var.set(f"Created {interval} backup schedule")
+            self.status_var.set(f"Created {interval} backup schedule (time: {effective_at_time if effective_at_time else 'hourly'})")
             self.refresh_scheduled_jobs()
-            messagebox.showinfo("Success", f"Successfully created the {interval} backup schedule")
+            messagebox.showinfo("Success", f"Successfully created the {interval} backup schedule (time: {effective_at_time if effective_at_time else 'hourly'})")
         else:
             self.status_var.set("Failed to create schedule")
-            messagebox.showerror("Error", "Unable to create the backup schedule")
+            messagebox.showerror("Error", "Unable to create the backup schedule; please check the logs.")
 
     def quick_schedule(self, interval):
         """Quickly create a scheduled backup"""
@@ -1931,7 +2028,8 @@ class ChromaDBBackupUI:
             success = self.backup._run_scheduled_backup(
                 job_id,
                 job_info["description"],
-                job_info["interval"]
+                job_info["interval"],
+                job_info.get("at_time")  # pass at_time along
             )
             self.root.after(0, lambda: self.finalize_job_execution(success))
 
@@ -1971,7 +2069,7 @@ class ChromaDBBackupUI:
         ).pack(anchor=W, pady=(0, 15))
 
         # Create the table
-        columns = ("id", "interval", "description", "next_run", "keep_count")
+        columns = ("id", "interval", "description", "next_run", "keep_count", "at_time")  # add at_time
         tree = ttk.Treeview(frame, columns=columns, show="headings", height=10)
 
         tree.heading("id", text="Job ID")
@@ -1979,12 +2077,14 @@ class ChromaDBBackupUI:
         tree.heading("interval", text="Interval")
         tree.heading("description", text="Description")
         tree.heading("next_run", text="Next run")
         tree.heading("keep_count", text="Keep count")
+        tree.heading("at_time", text="Run time")  # new
 
-        tree.column("id", width=150)
-        tree.column("interval", width=80)
-        tree.column("description", width=150)
-        tree.column("next_run", width=150)
-        tree.column("keep_count", width=80)
+        tree.column("id", width=120)
+        tree.column("interval", width=70)
+        tree.column("description", width=120)
+        tree.column("next_run", width=130)
+        tree.column("keep_count", width=70)
+        tree.column("at_time", width=70)  # new
 
         # Add the data
         for job in jobs:
@@ -1995,7 +2095,8 @@ class ChromaDBBackupUI:
                     job["interval"],
                     job["description"],
                     job["next_run"],
-                    job["keep_count"]
+                    job["keep_count"],
+                    job.get("at_time", "N/A")  # new
                 )
             )
 
@@ -3,6 +3,7 @@ import tkinter as tk
 from tkinter import filedialog, messagebox
 import json
 import chromadb
+from chromadb.utils import embedding_functions  # new import
 import datetime
 import pandas as pd
 import threading
@@ -15,6 +16,8 @@ from ttkbootstrap.scrolled import ScrolledFrame
 import numpy as np
 import logging
 from typing import List, Dict, Any, Optional, Union, Tuple
+import inspect  # used to inspect function signatures to check hybrid-search support
+import re  # new import for ID parsing in the UI
 
 class ChromaDBReader:
     """Main data model for the ChromaDB backup reader"""
@@ -28,6 +31,9 @@ class ChromaDBReader:
         self.query_results = []  # current query results
         self.chroma_client = None  # ChromaDB client
 
+        self.selected_embedding_model_name = "default"  # embedding model used for queries
+        self.query_embedding_function = None  # instantiated query embedding function; None means use the collection's internal default
+
         # Set up logging
         logging.basicConfig(
             level=logging.INFO,
@@ -119,12 +125,50 @@ class ChromaDBReader:
             self.collection_names = []
             return False
 
+    def set_query_embedding_model(self, model_name: str):
+        """Set the embedding model used for queries"""
+        self.selected_embedding_model_name = model_name
+        if model_name == "default":
+            self.query_embedding_function = None  # use the collection's internal embedding function
+            self.logger.info("Queries will use the collection's internal embedding model.")
+        elif model_name == "all-MiniLM-L6-v2":
+            try:
+                # Note: the sentence-transformers library must be installed
+                self.query_embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="all-MiniLM-L6-v2")
+                self.logger.info(f"Queries will use the external embedding model: {model_name}")
+            except Exception as e:
+                self.logger.error(f"Could not load SentenceTransformer all-MiniLM-L6-v2: {e}. The collection's internal model will be used.")
+                self.query_embedding_function = None
+        elif model_name == "paraphrase-multilingual-MiniLM-L12-v2":
+            try:
+                # Note: the sentence-transformers library must be installed
+                self.query_embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="paraphrase-multilingual-MiniLM-L12-v2")
+                self.logger.info(f"Queries will use the external embedding model: {model_name}")
+            except Exception as e:
+                self.logger.error(f"Could not load SentenceTransformer paraphrase-multilingual-MiniLM-L12-v2: {e}. The collection's internal model will be used.")
+                self.query_embedding_function = None
+        # Support for an additional model
+        elif model_name == "paraphrase-multilingual-mpnet-base-v2":
+            try:
+                # Note: the sentence-transformers library must be installed
+                self.query_embedding_function = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
+                self.logger.info(f"Queries will use the external embedding model: {model_name}")
+            except Exception as e:
+                self.logger.error(f"Could not load SentenceTransformer paraphrase-multilingual-mpnet-base-v2: {e}. The collection's internal model will be used.")
+                self.query_embedding_function = None
+        else:
+            self.logger.warning(f"Unknown query embedding model: {model_name}; the collection's internal model will be used.")
+            self.query_embedding_function = None
+
     def load_collection(self, collection_name: str) -> bool:
         """Load the specified collection"""
         if not self.chroma_client or not collection_name:
             return False
 
         try:
+            # An embedding_function could be specified when fetching a collection (usually at creation time).
+            # This is a read path, so the collection's embedding function is already fixed;
+            # we use self.query_embedding_function at query time to generate query_embeddings instead.
             self.current_collection = self.chroma_client.get_collection(collection_name)
             self.logger.info(f"Loaded collection: {collection_name}")
             return True
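The model-selection branches above all follow one pattern: try to build an embedding function for the requested name, and fall back to `None` (meaning "use the collection's own embedder") for unknown names or load failures. The shape of that lookup-with-fallback logic, with the loader stubbed out (the real code calls `SentenceTransformerEmbeddingFunction`; the names here are illustrative):

```python
def select_embedding_function(model_name, loaders):
    """Return an embedding function for model_name, or None to mean
    'use the collection's internal default' (unknown name or load failure)."""
    if model_name == "default":
        return None
    loader = loaders.get(model_name)
    if loader is None:
        return None  # unknown model -> collection default
    try:
        return loader()
    except Exception:
        return None  # load failure -> collection default

def broken():
    raise RuntimeError("model unavailable")

loaders = {"all-MiniLM-L6-v2": lambda: "embedder", "broken-model": broken}
print(select_embedding_function("default", loaders))           # None
print(select_embedding_function("all-MiniLM-L6-v2", loaders))  # embedder
print(select_embedding_function("broken-model", loaders))      # None
print(select_embedding_function("unknown", loaders))           # None
```

Collapsing the per-model `elif` branches into a table like this would also remove the duplicated try/except blocks, at the cost of the per-model log messages.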
@@ -133,40 +177,156 @@ class ChromaDBReader:
             self.current_collection = None
             return False
 
-    def execute_query(self, query_text: str, n_results: int = 5) -> List[Dict]:
-        """Execute a query and return the results"""
+    def execute_query(self, query_text: str, n_results: int = 5,
+                      query_type: str = "basic",
+                      where: Dict = None,
+                      where_document: Dict = None,
+                      include: List[str] = None,
+                      metadata_filter: Dict = None,
+                      hybrid_alpha: float = None) -> List[Dict]:
+        """Execute a query and return the results
+
+        Parameters:
+            query_text: query text
+            n_results: number of results to return
+            query_type: query type (basic, metadata, hybrid, multi_vector)
+            where: where filter conditions
+            where_document: document-content filter conditions
+            include: fields to include in the results
+            metadata_filter: metadata filter conditions
+            hybrid_alpha: weight for hybrid search (between 0 and 1; higher favors keyword search)
+        """
         if not self.current_collection or not query_text:
             return []
 
         try:
-            results = self.current_collection.query(
-                query_texts=[query_text],
-                n_results=n_results
-            )
+            query_params = {
+                "n_results": n_results
+            }
 
-            # Convert the results into a more convenient format
+            # Basic query handling
+            if query_type == "basic":
+                query_params["query_texts"] = [query_text]
+            # Multi-vector query (for comparing similarity across several queries)
+            elif query_type == "multi_vector":
+                # Support multiple query texts separated by "|||" or newlines
+                if "|||" in query_text:
+                    query_texts = [text.strip() for text in query_text.split("|||")]
+                else:
+                    query_texts = [text.strip() for text in query_text.splitlines() if text.strip()]
+                query_params["query_texts"] = query_texts
+
+            # Add the remaining query parameters
+            if where:
+                query_params["where"] = where
+            if where_document:
+                query_params["where_document"] = where_document
+            if include:
+                query_params["include"] = include
+            if metadata_filter:
+                # Merge the metadata filter directly into the where condition
+                if "where" not in query_params:
+                    query_params["where"] = {}
+                query_params["where"].update(metadata_filter)
+
+            # Hybrid search handling
+            if query_type == "hybrid" and hybrid_alpha is not None:
+                # Check whether this ChromaDB version supports hybrid search
+                if hasattr(self.current_collection, "query") and "alpha" in inspect.signature(self.current_collection.query).parameters:
+                    query_params["alpha"] = hybrid_alpha
+                    # Hybrid search generally requires query_texts
+                    if "query_texts" not in query_params:
+                        query_params["query_texts"] = [query_text]
+                else:
+                    self.logger.warning("This ChromaDB version does not support hybrid search; using a basic query instead")
+                    query_type = "basic"  # downgrade to a basic query
+                    query_params["query_texts"] = [query_text]
+            elif query_type == "hybrid" and hybrid_alpha is None:
+                # If hybrid search was requested without an alpha, default to a basic search
+                self.logger.warning("No alpha value provided for hybrid search; using a basic query instead")
+                query_type = "basic"
+                query_params["query_texts"] = [query_text]
+
+            # If query_type is not multi_vector and query_texts is unset, set it
+            if query_type not in ["multi_vector", "hybrid"] and "query_texts" not in query_params:
+                query_params["query_texts"] = [query_text]
+
+            # If an external embedding model was selected and this is not a hybrid query, generate the query embeddings
+            if query_type != "hybrid" and \
+               "query_texts" in query_params and \
+               self.query_embedding_function:
+
+                texts_to_embed = query_params["query_texts"]
+                try:
+                    # self.query_embedding_function takes List[str] and returns List[List[float]]
+                    generated_embeddings = self.query_embedding_function(texts_to_embed)
+
+                    if generated_embeddings and all(isinstance(emb, list) for emb in generated_embeddings):
+                        query_params["query_embeddings"] = generated_embeddings
+                        if "query_texts" in query_params:  # make sure it exists before deleting
+                            del query_params["query_texts"]
+                        self.logger.info(f"Generated {len(generated_embeddings)} query embeddings with {self.selected_embedding_model_name}.")
+                    else:
+                        self.logger.warning(f"Could not generate valid embeddings for all query texts with {self.selected_embedding_model_name}. Falling back to a text query with the collection's default embedding function. Embedding result: {generated_embeddings}")
+                except Exception as e:
+                    self.logger.error(f"Error generating query embeddings with {self.selected_embedding_model_name}: {e}. Falling back to a text query with the collection's default embedding function.")
+
+            # Execute the query
+            results = self.current_collection.query(**query_params)
+
+            # Process the results
-            processed_results = []
-            for i, (doc_id, document, metadata, distance) in enumerate(zip(
-                results['ids'][0],
-                results['documents'][0],
-                results['metadatas'][0] if 'metadatas' in results and results['metadatas'][0] else [{}] * len(results['ids'][0]),
-                results['distances'][0] if 'distances' in results else [0] * len(results['ids'][0])
-            )):
-                # Compute a similarity score (convert distance to similarity: 1 - normalized distance)
-                # Note: this may need adjusting depending on the distance metric ChromaDB uses
-                similarity = 1.0 - min(distance, 1.0)  # keep the value in 0-1
-
-                processed_results.append({
-                    "rank": i + 1,
-                    "id": doc_id,
-                    "document": document,
-                    "metadata": metadata,
-                    "similarity": similarity,
-                    "distance": distance
-                })
+            # Get all result lists returned by the query
+            ids_list = results.get('ids', [[]])
+            documents_list = results.get('documents', [[]])
+            metadatas_list = results.get('metadatas', [[]])
+            distances_list = results.get('distances', [[]])
|
||||
|
||||
# 確保列表長度一致,並為空列表提供默認值
|
||||
num_queries = len(ids_list)
|
||||
if not documents_list or len(documents_list) != num_queries:
|
||||
documents_list = [[] for _ in range(num_queries)]
|
||||
if not metadatas_list or len(metadatas_list) != num_queries:
|
||||
metadatas_list = [[{}] * len(ids_list[i]) for i in range(num_queries)]
|
||||
if not distances_list or len(distances_list) != num_queries:
|
||||
distances_list = [[0.0] * len(ids_list[i]) for i in range(num_queries)]
|
||||
|
||||
# 對於多查詢文本的情況,需要分別處理每個查詢的結果
|
||||
for query_idx, (ids, documents, metadatas, distances) in enumerate(zip(
|
||||
ids_list,
|
||||
documents_list,
|
||||
metadatas_list,
|
||||
distances_list
|
||||
)):
|
||||
# 處理每個查詢結果
|
||||
for i, (doc_id, document, metadata, distance) in enumerate(zip(
|
||||
ids, documents,
|
||||
metadatas if metadatas else [{}] * len(ids), # 再次確保元數據存在
|
||||
distances if distances else [0.0] * len(ids) # 再次確保距離存在
|
||||
)):
|
||||
# 計算相似度分數
|
||||
similarity = 1.0 - min(float(distance) if distance is not None else 1.0, 1.0)
|
||||
|
||||
result_item = {
|
||||
"rank": i + 1,
|
||||
"query_index": query_idx,
|
||||
"id": doc_id,
|
||||
"document": document,
|
||||
"metadata": metadata if metadata else {}, # 確保 metadata 是字典
|
||||
"similarity": similarity,
|
||||
"distance": float(distance) if distance is not None else 0.0,
|
||||
"query_type": query_type
|
||||
}
|
||||
|
||||
if query_type == "hybrid":
|
||||
result_item["hybrid_alpha"] = hybrid_alpha
|
||||
|
||||
processed_results.append(result_item)
|
||||
|
||||
self.query_results = processed_results
|
||||
self.logger.info(f"查詢完成,找到 {len(processed_results)} 個結果")
|
||||
self.logger.info(f"查詢完成,找到 {len(processed_results)} 個結果,查詢類型: {query_type}")
|
||||
return processed_results
|
||||
|
||||
except Exception as e:
|
||||
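The multi-vector branch above turns a single text field into several query texts, with `"|||"` taking priority over line breaks. The parsing rule can be exercised on its own; the function name here is illustrative, not part of the source:

```python
def split_query_texts(query_text: str) -> list:
    """Split raw input into query texts: '|||' takes priority, otherwise one per line."""
    if "|||" in query_text:
        # Mirrors the source: empty segments are kept after stripping
        return [t.strip() for t in query_text.split("|||")]
    return [t.strip() for t in query_text.splitlines() if t.strip()]
```

Note the asymmetry carried over from the source: blank lines are dropped, but empty `"|||"` segments are not.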
@@ -174,6 +334,64 @@ class ChromaDBReader:
            self.query_results = []
            return []

    def get_documents_by_ids(self, doc_ids: List[str]) -> List[Dict]:
        """Fetch documents by a list of document IDs."""
        if not self.current_collection:
            self.logger.warning("沒有選擇集合,無法按 ID 獲取文檔。")
            return []
        if not doc_ids:
            self.logger.warning("未提供文檔 ID。")
            return []

        try:
            results = self.current_collection.get(
                ids=doc_ids,
                include=["documents", "metadatas"]
            )

            processed_results = []
            retrieved_ids = results.get('ids', [])
            retrieved_documents = results.get('documents', [])
            retrieved_metadatas = results.get('metadatas', [])

            # Build a map for quick lookup of the retrieved document data
            found_docs_map = {}
            for i, r_id in enumerate(retrieved_ids):
                found_docs_map[r_id] = {
                    "document": retrieved_documents[i] if i < len(retrieved_documents) else None,
                    "metadata": retrieved_metadatas[i] if i < len(retrieved_metadatas) else {}
                }

            rank_counter = 1
            for original_id in doc_ids:  # iterate over the requested IDs to preserve their order and flag misses
                if original_id in found_docs_map:
                    doc_data = found_docs_map[original_id]
                    if doc_data["document"] is not None:
                        processed_results.append({
                            "rank": rank_counter,
                            "id": original_id,
                            "document": doc_data["document"],
                            "metadata": doc_data["metadata"],
                            "similarity": None,  # not applicable
                            "distance": None,  # not applicable
                            "query_type": "id_lookup"
                        })
                        rank_counter += 1
                    else:  # the ID was found but the document is empty (should not happen with get unless include is misconfigured)
                        self.logger.warning(f"ID {original_id} 找到但文檔內容為空。")
                # else: the ID was not in the returned results; optionally skip it or append a marker
                #     self.logger.info(f"ID {original_id} 未在集合中找到。")

            self.query_results = processed_results
            self.logger.info(f"按 ID 查詢完成,從請求的 {len(doc_ids)} 個ID中,實際找到 {len(processed_results)} 個文檔。")
            return processed_results

        except Exception as e:
            self.logger.error(f"按 ID 獲取文檔時出錯: {str(e)}")
            # traceback.print_exc()  # for debugging
            self.query_results = []
            return []

    def get_collection_info(self, collection_name: str) -> Dict:
        """Return detailed information about a collection."""
        if not self.chroma_client:
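The `found_docs_map` plus `rank_counter` logic above exists because a batch `get()` is not guaranteed to return documents in the requested order, and some IDs may be missing. That re-ranking step can be sketched and tested standalone with plain dicts (the function name is illustrative):

```python
def order_preserving_lookup(requested_ids, returned_ids, returned_docs):
    """Re-rank batch-get results in the order the IDs were requested, skipping misses."""
    found = {rid: doc for rid, doc in zip(returned_ids, returned_docs)}
    results, rank = [], 1
    for rid in requested_ids:
        # Missing IDs and empty documents are silently skipped, as in the source
        if rid in found and found[rid] is not None:
            results.append({"rank": rank, "id": rid, "document": found[rid]})
            rank += 1
    return results
```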
@@ -235,6 +453,16 @@ class ChromaDBReaderUI:
        # Window setup
        self.root.title("ChromaDB 備份讀取器")
        self.root.geometry("1280x800")

        # Variables for the query embedding model selection
        self.embedding_model_var = tk.StringVar(value="預設 (ChromaDB)")  # display name
        self.embedding_models = {
            "預設 (ChromaDB)": "default",
            "all-MiniLM-L6-v2 (ST)": "all-MiniLM-L6-v2",
            "paraphrase-multilingual-MiniLM-L12-v2 (ST)": "paraphrase-multilingual-MiniLM-L12-v2",
            "paraphrase-multilingual-mpnet-base-v2 (ST)": "paraphrase-multilingual-mpnet-base-v2"  # new model option
        }

        self.setup_ui()

        # Default theme
@@ -263,8 +491,12 @@ class ChromaDBReaderUI:
        self.right_panel = ttk.Frame(self.main_frame)
        self.right_panel.pack(side=LEFT, fill=BOTH, expand=YES)

        # Set up the status bar early, so self.status_var is defined before it is used elsewhere
        self.setup_status_bar()

        # Left panel
        self.setup_directory_frame()
        self.setup_embedding_model_frame()  # new: query embedding model selection frame
        self.setup_backups_frame()
        self.setup_collections_frame()

@@ -272,9 +504,6 @@ class ChromaDBReaderUI:
        # Right panel
        self.setup_query_frame()
        self.setup_results_frame()

        # Menu
        self.setup_menu()
@@ -315,6 +544,24 @@ class ChromaDBReaderUI:
        ttk.Button(dir_frame, text="瀏覽", command=self.browse_directory).pack(side=LEFT, padx=(5, 0))
        ttk.Button(dir_frame, text="載入", command=self.load_backups_directory).pack(side=LEFT, padx=(5, 0))

    def setup_embedding_model_frame(self):
        """Set up the query embedding model selection frame."""
        embedding_frame = ttk.LabelFrame(self.left_panel, text="查詢嵌入模型", padding=10)
        embedding_frame.pack(fill=X, pady=(0, 10))

        self.embedding_model_combo = ttk.Combobox(
            embedding_frame,
            textvariable=self.embedding_model_var,
            values=list(self.embedding_models.keys()),
            state="readonly"
        )
        self.embedding_model_combo.pack(fill=X, expand=YES)
        self.embedding_model_combo.set(list(self.embedding_models.keys())[0])  # default display value
        self.embedding_model_combo.bind("<<ComboboxSelected>>", self.on_embedding_model_changed)

        # Initialize the embedding model selection held by the reader
        self.on_embedding_model_changed()

    def setup_backups_frame(self):
        """Set up the backup list frame."""
        backups_frame = ttk.LabelFrame(self.left_panel, text="備份列表", padding=10)
@@ -388,12 +635,46 @@ class ChromaDBReaderUI:
        query_frame = ttk.LabelFrame(self.right_panel, text="查詢", padding=10)
        query_frame.pack(fill=X, pady=(0, 10))

        # Create a Notebook holding one tab per query type
        self.query_notebook = ttk.Notebook(query_frame)
        self.query_notebook.pack(fill=X, pady=5)

        # Basic query tab
        self.basic_query_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.basic_query_frame, text="基本查詢")

        # Metadata query tab
        self.metadata_query_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.metadata_query_frame, text="元數據查詢")

        # Hybrid query tab
        self.hybrid_query_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.hybrid_query_frame, text="混合查詢")

        # Multi-vector query tab
        self.multi_vector_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.multi_vector_frame, text="多向量查詢")

        # ID query tab (new)
        self.id_query_frame = ttk.Frame(self.query_notebook)
        self.query_notebook.add(self.id_query_frame, text="ID 查詢")

        # Populate each tab
        self.setup_basic_query_tab()
        self.setup_metadata_query_tab()
        self.setup_hybrid_query_tab()
        self.setup_multi_vector_tab()
        self.setup_id_query_tab()  # new

        # Query parameters (shared section)
        params_frame = ttk.Frame(query_frame)
        params_frame.pack(fill=X)
@@ -405,10 +686,103 @@ class ChromaDBReaderUI:
        ttk.Button(
            query_frame,
            text="執行查詢",
            command=self.execute_query,  # note: this execute_query is replaced by the new implementation below
            style="Accent.TButton"
        ).pack(pady=10)

    def setup_basic_query_tab(self):
        """Set up the basic query tab."""
        ttk.Label(self.basic_query_frame, text="查詢文本:").pack(anchor=W)
        self.basic_query_text = tk.Text(self.basic_query_frame, height=4, width=50)
        self.basic_query_text.pack(fill=X, pady=5)

    def setup_metadata_query_tab(self):
        """Set up the metadata query tab."""
        ttk.Label(self.metadata_query_frame, text="查詢文本:").pack(anchor=W)
        self.metadata_query_text = tk.Text(self.metadata_query_frame, height=4, width=50)
        self.metadata_query_text.pack(fill=X, pady=5)

        ttk.Label(self.metadata_query_frame, text="元數據過濾條件 (JSON 格式):").pack(anchor=W)
        self.metadata_filter_text = tk.Text(self.metadata_query_frame, height=4, width=50)
        self.metadata_filter_text.pack(fill=X, pady=5)
        self.metadata_filter_text.insert("1.0", '{"key": "value"}')

        # Help button explaining the metadata filter syntax
        ttk.Button(
            self.metadata_query_frame,
            text="?",
            width=2,
            command=self.show_metadata_help
        ).pack(anchor=E)

    def setup_hybrid_query_tab(self):
        """Set up the hybrid query tab."""
        ttk.Label(self.hybrid_query_frame, text="查詢文本:").pack(anchor=W)
        self.hybrid_query_text = tk.Text(self.hybrid_query_frame, height=4, width=50)
        self.hybrid_query_text.pack(fill=X, pady=5)

        alpha_frame = ttk.Frame(self.hybrid_query_frame)
        alpha_frame.pack(fill=X)

        ttk.Label(alpha_frame, text="Alpha 值 (0-1):").pack(side=LEFT)
        self.hybrid_alpha_var = tk.DoubleVar(value=0.5)
        ttk.Scale(
            alpha_frame,
            from_=0.0, to=1.0,
            variable=self.hybrid_alpha_var,
            orient=tk.HORIZONTAL,
            length=200
        ).pack(side=LEFT, padx=5, fill=X, expand=YES)

        # Label showing the scale's current value
        self.hybrid_alpha_label = ttk.Label(alpha_frame, text=f"{self.hybrid_alpha_var.get():.2f}")
        self.hybrid_alpha_label.pack(side=LEFT)
        # Update the label whenever the scale value changes
        self.hybrid_alpha_var.trace_add("write", lambda *args: self.hybrid_alpha_label.config(text=f"{self.hybrid_alpha_var.get():.2f}"))

        ttk.Label(self.hybrid_query_frame, text="注意: Alpha=0 完全使用向量搜索,Alpha=1 完全使用關鍵詞搜索").pack(pady=2)
        ttk.Label(self.hybrid_query_frame, text="混合查詢將使用集合原始嵌入模型,忽略上方選擇的查詢嵌入模型。", font=("TkDefaultFont", 8)).pack(pady=2)

    def setup_multi_vector_tab(self):
        """Set up the multi-vector query tab."""
        ttk.Label(self.multi_vector_frame, text="多個查詢文本 (每行一個,或使用 ||| 分隔):").pack(anchor=W)
        self.multi_vector_text = tk.Text(self.multi_vector_frame, height=6, width=50)
        self.multi_vector_text.pack(fill=X, pady=5)
        self.multi_vector_text.insert("1.0", "查詢文本 1\n|||查詢文本 2\n|||查詢文本 3")

        ttk.Label(self.multi_vector_frame, text="用於比較多個查詢之間的相似性").pack(pady=5)

    def setup_id_query_tab(self):
        """Set up the ID query tab."""
        ttk.Label(self.id_query_frame, text="文檔 ID (每行一個,或用逗號/空格分隔):").pack(anchor=tk.W)
        self.id_query_text = tk.Text(self.id_query_frame, height=6, width=50)
        self.id_query_text.pack(fill=tk.X, pady=5)
        self.id_query_text.insert("1.0", "id1\nid2,id3 id4")  # example
        ttk.Label(self.id_query_frame, text="此查詢將獲取指定ID的文檔,忽略上方“結果數量”設置。").pack(pady=5)

    def show_metadata_help(self):
        """Show an explanation of the metadata filter syntax."""
        help_text = """元數據過濾語法示例:

基本過濾:
{"category": "文章"}  # 精確匹配

範圍過濾:
{"date": {"$gt": "2023-01-01"}}  # 大於
{"date": {"$lt": "2023-12-31"}}  # 小於
{"count": {"$gte": 10}}  # 大於等於
{"count": {"$lte": 100}}  # 小於等於

多條件過濾:
{"$and": [{"category": "文章"}, {"author": "張三"}]}  # AND 條件
{"$or": [{"category": "文章"}, {"category": "新聞"}]}  # OR 條件

注意: 此處語法遵循 ChromaDB 的過濾語法,非標準 JSON 查詢語法。
"""
        messagebox.showinfo("元數據過濾語法說明", help_text)

    def setup_results_frame(self):
        """Set up the results display frame."""
        self.results_notebook = ttk.Notebook(self.right_panel)
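The metadata tab above takes its filter as raw JSON typed into a text box, which must be parsed and validated before it can be merged into the query's `where` condition. A hedged standalone sketch of that step (the function name and the choice to surface errors as `ValueError` are illustrative; the UI reports JSON errors via a message box instead):

```python
import json

def parse_metadata_filter(raw: str):
    """Parse filter-box text into a ChromaDB-style where dict.

    Returns None for empty input; raises ValueError on invalid JSON.
    """
    raw = raw.strip()
    if not raw:
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"metadata filter must be valid JSON: {e}") from e
```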
@@ -443,6 +817,26 @@ class ChromaDBReaderUI:
        status_label = ttk.Label(status_frame, textvariable=self.status_var, relief=tk.SUNKEN, anchor=W)
        status_label.pack(fill=X)

    def on_embedding_model_changed(self, event=None):
        """Handle a change of the selected query embedding model."""
        selected_display_name = self.embedding_model_var.get()
        model_name_key = self.embedding_models.get(selected_display_name, "default")

        if hasattr(self, 'reader') and self.reader:
            self.reader.set_query_embedding_model(model_name_key)  # update the model used by the reader

            # Update the status bar hint
            if model_name_key == "default":
                self.status_var.set("查詢將使用集合內部嵌入模型。")
            elif self.reader.query_embedding_function:  # check the model actually loaded
                self.status_var.set(f"查詢將使用外部模型: {selected_display_name}")
            else:  # loading failed
                self.status_var.set(f"模型 {selected_display_name} 加載失敗/無效,將使用集合內部模型。")
        else:
            # The reader is not initialized yet; this usually happens early in UI setup.
            # set_query_embedding_model is applied on the first call from setup_embedding_model_frame.
            pass

    def browse_directory(self):
        """Browse for a backup directory."""
        directory = filedialog.askdirectory(
@@ -527,27 +921,38 @@ class ChromaDBReaderUI:

        # Get the selected item
        item_id = selection[0]
        # item_index = self.backups_tree.index(item_id)  # this index is relative to the currently displayed items

        # Read the backup name straight from the Treeview item, then look it up in self.reader.backups
        try:
            backup_name_from_tree = self.backups_tree.item(item_id)["values"][0]
        except IndexError:
            self.logger.error("無法從 Treeview 獲取備份名稱")
            return

        # Find the actual backup index corresponding to this displayed item
        actual_backup_index = -1
        for i, backup_info in enumerate(self.reader.backups):
            if backup_info["name"] == backup_name_from_tree:
                actual_backup_index = i
                break

        if actual_backup_index == -1:
            self.logger.error(f"在備份列表中未找到名為 {backup_name_from_tree} 的備份")
            return

        # Load the backup
        self.status_var.set(f"正在載入備份: {backup_name_from_tree}...")
        self.root.update_idletasks()

        # The reader's embedding model is already up to date (on_embedding_model_changed handles it),
        # so no extra set_query_embedding_model call is needed here; the model selection is independent.

        def load_backup_thread():
            # load_backup no longer takes an embedding_model_name parameter,
            # because the embedding model selection only applies to queries
            success = self.reader.load_backup(actual_backup_index)
            self.root.after(0, lambda: self.finalize_backup_loading(success, backup_name_from_tree))

        threading.Thread(target=load_backup_thread).start()
@@ -618,7 +1023,7 @@ class ChromaDBReaderUI:
            # Fetch and show the collection details
            info = self.reader.get_collection_info(collection_name)
            info_text = f"集合: {info['name']}\n文檔數: {info['document_count']}\n向量維度: {info['dimension']}"
            # messagebox.showinfo("集合信息", info_text)  # commented out for now, to avoid a popup on every collection selection
        else:
            self.status_var.set(f"載入集合失敗: {collection_name}")
            messagebox.showerror("錯誤", f"無法載入集合: {collection_name}")
@@ -629,25 +1034,170 @@ class ChromaDBReaderUI:
            messagebox.showinfo("提示", "請先選擇一個集合")
            return

        # Determine the query type from the currently selected tab
        try:
            current_tab_widget = self.query_notebook.nametowidget(self.query_notebook.select())
            if current_tab_widget == self.basic_query_frame:
                current_tab = 0
            elif current_tab_widget == self.metadata_query_frame:
                current_tab = 1
            elif current_tab_widget == self.hybrid_query_frame:
                current_tab = 2
            elif current_tab_widget == self.multi_vector_frame:
                current_tab = 3
            elif current_tab_widget == self.id_query_frame:  # new: ID query tab
                current_tab = 4
            else:
                messagebox.showerror("錯誤", "未知的查詢標籤頁")
                return
        except tk.TclError:  # the notebook may not have any tab selected yet
            messagebox.showerror("錯誤", "請選擇一個查詢類型標籤頁")
            return

        # Read the query parameters
        try:
            n_results = int(self.n_results_var.get())
        except ValueError:
            messagebox.showerror("錯誤", "結果數量必須是整數")
            return

        # Dispatch to the handler for the selected query type
        if current_tab == 0:  # basic query
            query_text = self.basic_query_text.get("1.0", tk.END).strip()
            if not query_text:
                messagebox.showinfo("提示", "請輸入查詢文本")
                return

            self.status_var.set("正在執行基本查詢...")
            self.execute_basic_query(query_text, n_results)

        elif current_tab == 1:  # metadata query
            query_text = self.metadata_query_text.get("1.0", tk.END).strip()
            metadata_filter_text = self.metadata_filter_text.get("1.0", tk.END).strip()

            # An empty query text is allowed here if only metadata_filter is used

            try:
                metadata_filter = json.loads(metadata_filter_text) if metadata_filter_text else None
            except json.JSONDecodeError:
                messagebox.showerror("錯誤", "元數據過濾條件必須是有效的 JSON 格式")
                return

            if not query_text and not metadata_filter:
                messagebox.showinfo("提示", "請輸入查詢文本或元數據過濾條件")
                return

            self.status_var.set("正在執行元數據查詢...")
            self.execute_metadata_query(query_text, n_results, metadata_filter)

        elif current_tab == 2:  # hybrid query
            query_text = self.hybrid_query_text.get("1.0", tk.END).strip()
            hybrid_alpha = self.hybrid_alpha_var.get()

            if not query_text:
                messagebox.showinfo("提示", "請輸入查詢文本")
                return

            self.status_var.set("正在執行混合查詢...")
            self.execute_hybrid_query(query_text, n_results, hybrid_alpha)

        elif current_tab == 3:  # multi-vector query
            query_text = self.multi_vector_text.get("1.0", tk.END).strip()

            if not query_text:
                messagebox.showinfo("提示", "請輸入查詢文本")
                return

            self.status_var.set("正在執行多向量查詢...")
            self.execute_multi_vector_query(query_text, n_results)

        elif current_tab == 4:  # ID query
            id_input_str = self.id_query_text.get("1.0", tk.END).strip()
            if not id_input_str:
                messagebox.showinfo("提示", "請輸入文檔 ID。")
                return

            # Parse IDs: comma-, whitespace-, and newline-separated values are all accepted
            doc_ids = [id_val.strip() for id_val in re.split(r'[,\s\n]+', id_input_str) if id_val.strip()]

            if not doc_ids:
                messagebox.showinfo("提示", "未解析到有效的文檔 ID。")
                return

            self.status_var.set("正在按 ID 獲取文檔...")
            self.execute_id_lookup_query(doc_ids)

    def execute_basic_query(self, query_text, n_results):
        """Run a basic query."""
        self.status_var.set(f"正在執行基本查詢: {query_text[:30]}...")
        self.root.update_idletasks()
        def query_thread():
            results = self.reader.execute_query(
                query_text=query_text,
                n_results=n_results,
                query_type="basic"
            )
            self.root.after(0, lambda: self.display_results(results))

        threading.Thread(target=query_thread, daemon=True).start()

    def execute_metadata_query(self, query_text, n_results, metadata_filter):
        """Run a metadata query."""
        self.status_var.set(f"正在執行元數據查詢: {query_text[:30]}...")
        self.root.update_idletasks()
        def query_thread():
            results = self.reader.execute_query(
                query_text=query_text,
                n_results=n_results,
                query_type="metadata",  # the backend turns this into a "where" condition
                metadata_filter=metadata_filter
            )
            self.root.after(0, lambda: self.display_results(results))

        threading.Thread(target=query_thread, daemon=True).start()

    def execute_hybrid_query(self, query_text, n_results, hybrid_alpha):
        """Run a hybrid query."""
        self.status_var.set(f"正在執行混合查詢 (α={hybrid_alpha:.2f}): {query_text[:30]}...")
        self.root.update_idletasks()
        def query_thread():
            results = self.reader.execute_query(
                query_text=query_text,
                n_results=n_results,
                query_type="hybrid",
                hybrid_alpha=hybrid_alpha
            )
            self.root.after(0, lambda: self.display_results(results))

        threading.Thread(target=query_thread, daemon=True).start()

    def execute_multi_vector_query(self, query_text, n_results):
        """Run a multi-vector query."""
        self.status_var.set(f"正在執行多向量查詢: {query_text.splitlines()[0][:30] if query_text.splitlines() else ''}...")
        self.root.update_idletasks()
        def query_thread():
            results = self.reader.execute_query(
                query_text=query_text,
                n_results=n_results,
                query_type="multi_vector"
            )
            self.root.after(0, lambda: self.display_results(results))

        threading.Thread(target=query_thread, daemon=True).start()

    def execute_id_lookup_query(self, doc_ids: List[str]):
        """Run an ID lookup query."""
        self.status_var.set(f"正在按 ID 獲取 {len(doc_ids)} 個文檔...")
        self.root.update_idletasks()
        def query_thread():
            results = self.reader.get_documents_by_ids(doc_ids)
            self.root.after(0, lambda: self.display_results(results))

        threading.Thread(target=query_thread, daemon=True).start()

    def display_results(self, results):
        """Display the query results."""
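The ID tab accepts IDs separated by commas, spaces, or newlines; the `re.split` rule used above can be checked in isolation (the wrapper function name is illustrative):

```python
import re

def parse_doc_ids(raw: str) -> list:
    """Split raw user input into document IDs on commas and any whitespace."""
    # \n is redundant inside [\s] but kept to mirror the source pattern
    return [v.strip() for v in re.split(r'[,\s\n]+', raw) if v.strip()]
```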
@@ -679,27 +1229,49 @@ class ChromaDBReaderUI:
            widget.destroy()

        # Build the results table
        columns = ("rank", "similarity", "query_type", "id", "document")
        tree = ttk.Treeview(self.list_view, columns=columns, show="headings")
        tree.heading("rank", text="#")
        tree.heading("similarity", text="相似度")
        tree.heading("query_type", text="查詢類型")
        tree.heading("id", text="文檔ID")
        tree.heading("document", text="文檔內容")

        tree.column("rank", width=50, anchor=CENTER)
        tree.column("similarity", width=100, anchor=CENTER)
        tree.column("query_type", width=120, anchor=CENTER)  # widened to fit longer type names
        tree.column("id", width=150)
        tree.column("document", width=530)  # adjusted width

        # Display-name mapping for the query types
        query_type_names = {
            "basic": "基本查詢",
            "metadata": "元數據查詢",
            "hybrid": "混合查詢",
            "multi_vector": "多向量查詢",
            "id_lookup": "ID 查詢"  # new
        }

        # Add the results to the table
        for result in results:
            raw_query_type = result.get("query_type", "basic")
            display_query_type = query_type_names.get(raw_query_type, raw_query_type.capitalize())

            if raw_query_type == "hybrid" and "hybrid_alpha" in result:
                display_query_type += f" (α={result['hybrid_alpha']:.2f})"
            if raw_query_type == "multi_vector" and "query_index" in result:
                display_query_type += f" (Q{result['query_index']+1})"

            similarity_display = f"{result.get('similarity', 0.0):.4f}" if result.get('similarity') is not None else "N/A"

            tree.insert(
                "", "end",
                values=(
                    result.get("rank", "-"),
                    similarity_display,
                    display_query_type,
                    result.get("id", "N/A"),
                    result.get("document", "")[:100] + ("..." if len(result.get("document", "")) > 100 else "")
                )
            )
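Two small formatting rules recur in the result views above: the document preview is cut to 100 characters with a trailing ellipsis only when actually truncated, and a `None` similarity (ID lookups) renders as "N/A" instead of a number. Extracted as standalone helpers for testing (the function names are illustrative):

```python
def preview(document: str, limit: int = 100) -> str:
    """Truncate a document for the list view, appending '...' only when cut."""
    return document[:limit] + ("..." if len(document) > limit else "")

def similarity_text(similarity) -> str:
    """Format similarity to 4 decimals, or 'N/A' when it does not apply."""
    return f"{similarity:.4f}" if similarity is not None else "N/A"
```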
@@ -710,7 +1282,6 @@ class ChromaDBReaderUI:
        # Double-click an item to show its full content
        tree.bind("<Double-1>", lambda event: self.show_full_document(tree))

        # Layout
        tree.pack(side=LEFT, fill=BOTH, expand=YES)
        scrollbar.pack(side=RIGHT, fill=Y)
@@ -739,7 +1310,10 @@ class ChromaDBReaderUI:

        # Document details
        info_text = f"文檔ID: {result['id']}\n"
        if result.get('similarity') is not None:
            info_text += f"相似度: {result['similarity']:.4f}\n"
        else:
            info_text += "相似度: N/A\n"

        if result['metadata']:
            info_text += "\n元數據:\n"
@@ -806,9 +1380,10 @@ class ChromaDBReaderUI:
        title_frame = ttk.Frame(card)
        title_frame.pack(fill=X)

        similarity_text_detail = f"{result['similarity']:.4f}" if result.get('similarity') is not None else "N/A"
        ttk.Label(
            title_frame,
            text=f"#{result['rank']} - 相似度: {similarity_text_detail}",
            font=("TkDefaultFont", 10, "bold")
        ).pack(side=LEFT)
@@ -881,7 +1456,10 @@ class ChromaDBReaderUI:

        # Document details
        info_text = f"文檔ID: {result['id']}\n"
        if result.get('similarity') is not None:
            info_text += f"相似度: {result['similarity']:.4f}\n"
        else:
            info_text += "相似度: N/A\n"

        if result['metadata']:
            info_text += "\n元數據:\n"
147 tools/color_picker.py Normal file
@@ -0,0 +1,147 @@
|
||||
import cv2
|
||||
import numpy as np
|
||||
import pyautogui
|
||||
|
||||
def pick_color_fixed():
|
||||
# 截取游戏区域
|
||||
screenshot = pyautogui.screenshot(region=(150, 330, 600, 880))
|
||||
img = np.array(screenshot)
|
||||
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
|
||||
|
||||
# 转为HSV
|
||||
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
|
||||
|
||||
# 创建窗口和滑块
|
||||
cv2.namedWindow('Color Picker')
|
||||
|
||||
# 存储采样点
|
||||
sample_points = []
|
||||
|
||||
# 定义鼠标回调函数
|
||||
def mouse_callback(event, x, y, flags, param):
|
||||
if event == cv2.EVENT_LBUTTONDOWN:
|
||||
# 获取点击位置的HSV值
|
||||
hsv_value = hsv_img[y, x]
|
||||
sample_points.append(hsv_value)
|
||||
print(f"添加采样点 #{len(sample_points)}: HSV = {hsv_value}")
|
||||
|
||||
# 在图像上显示采样点
|
||||
cv2.circle(img, (x, y), 3, (0, 255, 0), -1)
|
||||
cv2.imshow('Color Picker', img)
|
||||
|
||||
# 如果有足够多的采样点,计算更精确的范围
|
||||
if len(sample_points) >= 1:
|
||||
calculate_range()
|
||||
|
||||
def calculate_range():
|
||||
"""安全计算HSV范围,避免溢出"""
|
||||
if not sample_points:
|
||||
return
|
||||
|
||||
# 转换为numpy数组
|
||||
points_array = np.array(sample_points)
|
||||
|
||||
# 提取各通道的值并安全计算范围
|
||||
h_values = points_array[:, 0].astype(np.int32) # 转为int32避免溢出
|
||||
s_values = points_array[:, 1].astype(np.int32)
|
||||
v_values = points_array[:, 2].astype(np.int32)
|
||||
|
||||
# 检查H值是否跨越边界
|
||||
        h_range = np.max(h_values) - np.min(h_values)
        h_crosses_boundary = h_range > 90 and len(h_values) > 2

        # Compute safe range values
        if h_crosses_boundary:
            print("Detected H values possibly crossing the red boundary (0/180)!")
            # Special handling for H values that wrap around the boundary
            # Method 1: simple approach - use the full range
            h_min = 0
            h_max = 179
            print(f"Using full H range: [{h_min}, {h_max}]")
        else:
            # Compute the H range normally
            h_min = max(0, np.min(h_values) - 5)
            h_max = min(179, np.max(h_values) + 5)

        # Safely compute the S and V ranges
        s_min = max(0, np.min(s_values) - 15)
        s_max = min(255, np.max(s_values) + 15)
        v_min = max(0, np.min(v_values) - 15)
        v_max = min(255, np.max(v_values) + 15)

        print("\nRecommended HSV range:")
        print(f"\"hsv_lower\": [{h_min}, {s_min}, {v_min}],")
        print(f"\"hsv_upper\": [{h_max}, {s_max}, {v_max}],")

        # Show a mask preview
        show_mask_preview(h_min, h_max, s_min, s_max, v_min, v_max)

    def show_mask_preview(h_min, h_max, s_min, s_max, v_min, v_max):
        """Show a mask preview and mark the detected regions."""

        # Build the mask
        if h_min <= h_max:
            # Standard contiguous range
            mask = cv2.inRange(hsv_img,
                               np.array([h_min, s_min, v_min]),
                               np.array([h_max, s_max, v_max]))
        else:
            # Handle H values that wrap around the boundary
            mask1 = cv2.inRange(hsv_img,
                                np.array([h_min, s_min, v_min]),
                                np.array([179, s_max, v_max]))
            mask2 = cv2.inRange(hsv_img,
                                np.array([0, s_min, v_min]),
                                np.array([h_max, s_max, v_max]))
            mask = cv2.bitwise_or(mask1, mask2)

        # Morphological close to connect nearby regions
        kernel = np.ones((5, 5), np.uint8)
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

        # Find connected components
        num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)

        # Create the result image
        result_img = img.copy()
        detected_count = 0

        # Process each connected component
        for i in range(1, num_labels):  # Skip the background (0)
            area = stats[i, cv2.CC_STAT_AREA]
            # Area filtering
            if 3000 <= area <= 100000:
                detected_count += 1
                x = stats[i, cv2.CC_STAT_LEFT]
                y = stats[i, cv2.CC_STAT_TOP]
                w = stats[i, cv2.CC_STAT_WIDTH]
                h = stats[i, cv2.CC_STAT_HEIGHT]

                # Draw the region's bounding box
                cv2.rectangle(result_img, (x, y), (x + w, y + h), (0, 255, 0), 2)
                # Show the region ID
                cv2.putText(result_img, f"#{i}", (x + 5, y + 20),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

        # Show the result
        cv2.imshow('Mask Preview', result_img)
        print(f"Detected {detected_count} regions of suitable size")

    # Set the mouse callback
    cv2.setMouseCallback('Color Picker', mouse_callback)

    # Show usage instructions
    print("Instructions:")
    print("1. Click several spots on a bubble to sample colors")
    print("2. The program automatically computes a suitable HSV range")
    print("3. Green boxes mark the detected regions")
    print("4. Press ESC to exit")
    print("\n[Note] If a bubble mixes red and purple, you may need two configs to handle the H-channel boundary issue")

    # Show the image
    cv2.imshow('Color Picker', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

if __name__ == "__main__":
    pick_color_fixed()
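When `h_min > h_max`, the mask above is built from two `cv2.inRange` calls OR-ed together, because OpenCV hue wraps at 0/180. A minimal sketch of the same wrap-around test on single hue values, in plain Python without OpenCV (the function name is hypothetical):

```python
def hue_in_range(h, h_min, h_max):
    """Return True if hue h (0-179) falls in [h_min, h_max],
    treating h_min > h_max as a range wrapping through 0/180."""
    if h_min <= h_max:
        # Standard contiguous range
        return h_min <= h <= h_max
    # Wrapped range: [h_min, 179] OR [0, h_max], mirroring the
    # cv2.bitwise_or(mask1, mask2) combination in the preview code
    return h >= h_min or h <= h_max

# A red hue near the wrap point is matched by the range 170..10
print(hue_in_range(175, 170, 10))  # True
print(hue_in_range(90, 170, 10))   # False
```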
@@ -4,6 +4,8 @@
 import pyautogui
 import cv2  # opencv-python
 import numpy as np
+import sys  # Added for special character handling
+import io  # Added for special character handling
 import pyperclip
 import time
 import os
@@ -16,12 +18,107 @@ import queue
 from typing import List, Tuple, Optional, Dict, Any
 import threading  # Import threading for Lock if needed, or just use a simple flag
 import math  # Added for distance calculation in dual method
+import time  # Ensure time is imported for MessageDeduplication
+from simple_bubble_dedup import SimpleBubbleDeduplication
+import difflib  # Added for text similarity
+
+class MessageDeduplication:
+    def __init__(self, expiry_seconds=3600):  # 1 hour expiry time
+        self.processed_messages = {}  # {message_key: timestamp}
+        self.expiry_seconds = expiry_seconds
+
+    def is_duplicate(self, sender, content):
+        """Check if the message is a duplicate within the expiry period using text similarity."""
+        if not sender or not content:
+            return False  # Missing necessary info, treat as new message
+
+        current_time = time.time()
+
+        # Walk through all previously processed messages
+        for key, timestamp in list(self.processed_messages.items()):
+            # Skip expired entries. Iterating over list(self.processed_messages.items())
+            # keeps this safe even if the dict changes size; actual removal of
+            # expired items is handled by purge_expired().
+            if current_time - timestamp >= self.expiry_seconds:
+                continue
+
+            # Parse the stored sender and content
+            stored_sender, stored_content = key.split(":", 1)
+
+            # Only compare messages from the same sender
+            if sender.lower() == stored_sender.lower():
+                # Calculate text similarity
+                similarity = difflib.SequenceMatcher(None, content, stored_content).ratio()
+                if similarity >= 0.95:  # Use 0.95 as threshold
+                    print(f"Deduplicator: Detected similar message (similarity: {similarity:.2f}): {sender} - {content[:20]}...")
+                    return True
+
+        # Not a duplicate; record it
+        # Note: the raw content is stored here, not clean_content
+        message_key = f"{sender.lower()}:{content}"
+        self.processed_messages[message_key] = current_time
+        return False
+
+    # The create_key method is no longer needed and can be removed
+    # def create_key(self, sender, content):
+    #     """Create a standardized composite key."""
+    #     # Thoroughly standardize text - remove all whitespace and punctuation, lowercase
+    #     clean_content = ''.join(c.lower() for c in content if c.isalnum())
+    #     clean_sender = ''.join(c.lower() for c in sender if c.isalnum())
+
+    #     # Truncate content to first 100 chars to prevent overly long keys
+    #     if len(clean_content) > 100:
+    #         clean_content = clean_content[:100]

+    #     return f"{clean_sender}:{clean_content}"
+
+    def purge_expired(self):
+        """Remove expired message records."""
+        current_time = time.time()
+        expired_keys = [k for k, t in self.processed_messages.items()
+                        if current_time - t >= self.expiry_seconds]
+
+        for key in expired_keys:
+            del self.processed_messages[key]
+
+        if expired_keys:  # Log only if something was purged
+            print(f"Deduplicator: Purged {len(expired_keys)} expired message records.")
+        return len(expired_keys)
+
+    def clear_all(self):
+        """Clear all recorded messages (for F7/F8 functionality)."""
+        count = len(self.processed_messages)
+        self.processed_messages.clear()
+        if count > 0:  # Log only if something was cleared
+            print(f"Deduplicator: Cleared all {count} message records.")
+        return count
+
 # --- Global Pause Flag ---
 # Using a simple mutable object (list) for thread-safe-like access without explicit lock
 # Or could use threading.Event()
 monitoring_paused_flag = [False]  # List containing a boolean
 
+# --- Global Error Handling Setup for Text Encoding ---
+def handle_text_encoding(text, default_text="[Unprocessable text]"):
+    """Safely handle any text so that encoding issues never crash the program."""
+    if text is None:
+        return default_text
+
+    try:
+        # Probe that the text survives UTF-8 encoding
+        text.encode('utf-8')
+        return text
+    except UnicodeEncodeError:
+        try:
+            # Replace special characters with a displayable placeholder
+            return text.encode('utf-8', errors='replace').decode('utf-8')
+        except Exception:
+            # Last resort: drop any characters that cannot be handled
+            try:
+                return text.encode('utf-8', errors='ignore').decode('utf-8')
+            except Exception:
+                return default_text
+
 # --- Color Config Loading ---
 def load_bubble_colors(config_path='bubble_colors.json'):
     """Loads bubble color configuration from a JSON file."""
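The 0.95 threshold used by `is_duplicate` comes from `difflib.SequenceMatcher.ratio()`, which returns a similarity score in [0, 1]. A standalone check of how near-identical messages score under that rule (helper name `is_similar` is illustrative, not part of the codebase):

```python
import difflib

def is_similar(a, b, threshold=0.95):
    """Mirror of the dedup rule: ratio() >= threshold means 'duplicate'."""
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold

# A trailing punctuation change still counts as a duplicate...
print(is_similar("hello world!", "hello world"))  # True
# ...while a genuinely different message does not.
print(is_similar("hello world", "see you tomorrow"))  # False
```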
@@ -120,6 +217,9 @@ PROFILE_OPTION_IMG = os.path.join(TEMPLATE_DIR, "profile_option.png")
 COPY_NAME_BUTTON_IMG = os.path.join(TEMPLATE_DIR, "copy_name_button.png")
 SEND_BUTTON_IMG = os.path.join(TEMPLATE_DIR, "send_button.png")
 CHAT_INPUT_IMG = os.path.join(TEMPLATE_DIR, "chat_input.png")
+# Newly added template paths
+CHAT_OPTION_IMG = os.path.join(TEMPLATE_DIR, "chat_option.png")
+UPDATE_CONFIRM_IMG = os.path.join(TEMPLATE_DIR, "update_confirm.png")
 # State Detection
 PROFILE_NAME_PAGE_IMG = os.path.join(TEMPLATE_DIR, "Profile_Name_page.png")
 PROFILE_PAGE_IMG = os.path.join(TEMPLATE_DIR, "Profile_page.png")
@@ -1068,7 +1168,13 @@ class InteractionModule:
 
         if copied and copied_text and copied_text != "___MCP_CLEAR___":
             print(f"Successfully copied text, length: {len(copied_text)}")
-            return copied_text.strip()
+            # Added encoding-safe handling
+            try:
+                safe_text = handle_text_encoding(copied_text.strip())
+                return safe_text
+            except Exception as e:
+                print(f"Error handling copied text encoding: {str(e)}")
+                return copied_text.strip()  # Try to return the raw text even on error
         else:
             print("Error: Copy operation unsuccessful or clipboard content invalid.")
             return None
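The fallback chain in `handle_text_encoding` relies on the `errors='replace'` and `errors='ignore'` modes of `str.encode`. A small demonstration of the difference; ASCII is used as the target codec here purely so the failure is visible (the module itself works with UTF-8):

```python
text = "wolf \u2764 chat"  # contains a character ASCII cannot encode

# 'replace' substitutes an encodable placeholder character...
print(text.encode("ascii", errors="replace").decode("ascii"))  # wolf ? chat
# ...while 'ignore' silently drops the offending character.
print(text.encode("ascii", errors="ignore").decode("ascii"))  # wolf  chat
```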
@@ -1601,13 +1707,22 @@ def perform_state_cleanup(detector: DetectionModule, interactor: InteractionModu
 
 
 # --- UI Monitoring Loop Function (To be run in a separate thread) ---
-def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue):
+def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queue, deduplicator: 'MessageDeduplication'):
    """
    Continuously monitors the UI, detects triggers, performs interactions,
    puts trigger data into trigger_queue, and processes commands from command_queue.
    """
    print("\n--- Starting UI Monitoring Loop (Thread) ---")
 
+    # --- Initialize the bubble-image deduplication system (new) ---
+    bubble_deduplicator = SimpleBubbleDeduplication(
+        storage_file="simple_bubble_dedup.json",
+        max_bubbles=4,   # Keep the 4 most recent bubbles
+        threshold=7,     # Hash-difference threshold (smaller is stricter)
+        hash_size=16     # Hash size
+    )
+    # --- End of bubble-image deduplication init ---
+
    # --- Initialization (Instantiate modules within the thread) ---
    # --- Template Dictionary Setup (Refactored) ---
    essential_templates = {
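`SimpleBubbleDeduplication` compares bubble screenshots by perceptual hash, treating two images as duplicates when their hash difference stays under `threshold`. The idea can be sketched with a tiny average hash over grayscale pixel grids (pure Python; the project's real implementation may differ in detail):

```python
def average_hash(pixels):
    """Hash a grayscale grid: one bit per pixel, set when above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(h1, h2):
    """Number of differing bits; a small distance means a near-duplicate image."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2))

a = [[10, 200], [220, 30]]
b = [[12, 198], [210, 25]]   # same bubble, slight rendering noise
c = [[200, 10], [30, 220]]   # different content

print(hamming(average_hash(a), average_hash(b)))  # 0 -> duplicate under threshold 7
print(hamming(average_hash(a), average_hash(c)))  # 4 -> distinct
```

Hashing tolerates small pixel-level noise that an exact byte comparison would miss, which is why it works on repeated UI screenshots.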
@@ -1639,7 +1754,9 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
        'page_sec': PAGE_SEC_IMG, 'page_str': PAGE_STR_IMG,
        'dismiss_button': DISMISS_BUTTON_IMG, 'confirm_button': CONFIRM_BUTTON_IMG,
        'close_button': CLOSE_BUTTON_IMG, 'back_arrow': BACK_ARROW_IMG,
-        'reply_button': REPLY_BUTTON_IMG
+        'reply_button': REPLY_BUTTON_IMG,
+        # Newly added templates
+        'chat_option': CHAT_OPTION_IMG, 'update_confirm': UPDATE_CONFIRM_IMG,
    }
    legacy_templates = {
        # Deprecated Keywords (for legacy method fallback)
@@ -1745,13 +1862,27 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
                elif action == 'clear_history':  # Added for F7
                    print("UI Thread: Processing clear_history command.")
                    recent_texts.clear()
-                    print("UI Thread: recent_texts cleared.")
+                    deduplicator.clear_all()  # Simultaneously clear deduplication records
+
+                    # --- New: clear bubble deduplication records ---
+                    if 'bubble_deduplicator' in locals():
+                        bubble_deduplicator.clear_all()
+                    # --- End of bubble deduplication cleanup ---
+
+                    print("UI Thread: recent_texts and deduplicator records cleared.")
+
                elif action == 'reset_state':  # Added for F8 resume
                    print("UI Thread: Processing reset_state command.")
                    recent_texts.clear()
                    last_processed_bubble_info = None
-                    print("UI Thread: recent_texts cleared and last_processed_bubble_info reset.")
+                    deduplicator.clear_all()  # Simultaneously clear deduplication records
+
+                    # --- New: clear bubble deduplication records ---
+                    if 'bubble_deduplicator' in locals():
+                        bubble_deduplicator.clear_all()
+                    # --- End of bubble deduplication cleanup ---
+
+                    print("UI Thread: recent_texts, last_processed_bubble_info, and deduplicator records reset.")
+
                else:
                    print(f"UI Thread: Received unknown command: {action}")
@@ -1776,6 +1907,19 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
            # --- If not paused, proceed with UI Monitoring ---
            # print("[DEBUG] UI Loop: Monitoring is active. Proceeding...") # DEBUG REMOVED
 
+            # --- Added: check for the chat_option overlay ---
+            try:
+                chat_option_locs = detector._find_template('chat_option', confidence=0.8)
+                if chat_option_locs:
+                    print("UI Thread: Detected chat_option overlay. Pressing ESC to dismiss...")
+                    interactor.press_key('esc')
+                    time.sleep(0.2)  # Give the UI a moment to respond
+                    print("UI Thread: Pressed ESC to dismiss chat_option. Continuing...")
+                    continue  # Restart the loop to make sure the overlay is gone
+            except Exception as chat_opt_err:
+                print(f"UI Thread: Error checking for chat_option: {chat_opt_err}")
+                # Keep going; do not interrupt the main flow
+
            # --- Check for Main Screen Navigation ---
            # print("[DEBUG] UI Loop: Checking for main screen navigation...") # DEBUG REMOVED
            try:
@@ -1814,8 +1958,19 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
                # Use a slightly lower confidence maybe, or state_confidence
                chat_room_locs = detector._find_template('chat_room', confidence=detector.state_confidence)
                if not chat_room_locs:
-                    print("UI Thread: Not in chat room state before bubble detection. Attempting cleanup...")
-                    # Call the existing cleanup function to try and return
+                    print("UI Thread: Not in chat room state before bubble detection. Checking for update confirm...")
+
+                    # Check whether the update-confirm button is present
+                    update_confirm_locs = detector._find_template('update_confirm', confidence=0.8)
+                    if update_confirm_locs:
+                        print("UI Thread: Detected update_confirm button. Clicking to proceed...")
+                        interactor.click_at(update_confirm_locs[0][0], update_confirm_locs[0][1])
+                        time.sleep(0.5)  # Give the update process some time
+                        print("UI Thread: Clicked update_confirm button. Continuing...")
+                        continue  # Restart the loop to re-check the state
+
+                    # No update-confirm button found; fall back to the original cleanup logic
+                    print("UI Thread: No update_confirm button found. Attempting cleanup...")
                    perform_state_cleanup(detector, interactor)
                    # Regardless of cleanup success, restart the loop to re-evaluate state from the top
                    print("UI Thread: Continuing loop after attempting chat room cleanup.")
@@ -1916,6 +2071,13 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
                    print("Warning: Failed to capture bubble snapshot. Skipping this bubble.")
                    continue  # Skip to next bubble
 
+                # --- New: Image deduplication check ---
+                if bubble_deduplicator.is_duplicate(bubble_snapshot, bubble_region_tuple):
+                    print("Detected duplicate bubble, skipping processing")
+                    perform_state_cleanup(detector, interactor)
+                    continue  # Skip processing this bubble
+                # --- End of image deduplication check ---
+
                # --- Save Snapshot for Debugging ---
                try:
                    screenshot_index = (screenshot_counter % MAX_DEBUG_SCREENSHOTS) + 1
@@ -1982,16 +2144,6 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
                    perform_state_cleanup(detector, interactor)  # Attempt cleanup
                    continue  # Skip to next bubble
 
-                # Check recent text history
-                # print("[DEBUG] UI Loop: Checking recent text history...") # DEBUG REMOVED
-                if bubble_text in recent_texts:
-                    print(f"Content '{bubble_text[:30]}...' in recent history, skipping this bubble.")
-                    continue  # Skip to next bubble
-
-                print(">>> New trigger event <<<")
-                # Add to recent texts *before* potentially long interaction
-                recent_texts.append(bubble_text)
-
                # 5. Interact: Get Sender Name (uses re-location internally via retrieve_sender_name_interaction)
                # print("[DEBUG] UI Loop: Retrieving sender name...") # DEBUG REMOVED
                sender_name = None
@@ -2069,6 +2221,32 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
                    print("Error: Could not get sender name for this bubble, skipping.")
                    continue  # Skip to next bubble
 
+                # --- Deduplication Check ---
+                # This is the new central point for deduplication and recent_texts logic
+                if sender_name and bubble_text:  # Ensure both are valid before deduplication
+                    if deduplicator.is_duplicate(sender_name, bubble_text):
+                        print(f"UI Thread: Skipping duplicate message via Deduplicator: {sender_name} - {bubble_text[:30]}...")
+                        # Cleanup UI state as interaction might have occurred during sender_name retrieval
+                        perform_state_cleanup(detector, interactor)
+                        continue  # Skip this bubble
+
+                    # If not a duplicate by deduplicator, then check recent_texts (original safeguard)
+                    # if bubble_text in recent_texts:
+                    #     print(f"UI Thread: Content '{bubble_text[:30]}...' in recent_texts history, skipping.")
+                    #     perform_state_cleanup(detector, interactor)  # Cleanup as we are skipping
+                    #     continue
+
+                    # If not a duplicate by any means, proceed
+                    print(">>> New trigger event (passed deduplication) <<<")
+                    # recent_texts.append(bubble_text)  # No longer needed with image deduplication
+                else:
+                    # This case implies sender_name or bubble_text was None/empty,
+                    # which should have been caught by earlier checks.
+                    # If somehow reached, log and skip.
+                    print(f"Warning: sender_name ('{sender_name}') or bubble_text ('{bubble_text[:30]}...') is invalid before deduplication check. Skipping.")
+                    perform_state_cleanup(detector, interactor)
+                    continue
+
                # --- Attempt to activate reply context ---
                # print("[DEBUG] UI Loop: Attempting to activate reply context...") # DEBUG REMOVED
                reply_context_activated = False
@@ -2115,34 +2293,71 @@ def run_ui_monitoring_loop(trigger_queue: queue.Queue, command_queue: queue.Queu
 
                # 7. Send Trigger Info to Main Thread
                print("\n>>> Putting trigger info in Queue <<<")
-                print(f" Sender: {sender_name}")
-                print(f" Content: {bubble_text[:100]}...")
+                try:
+                    # Safely process and display the sender name
+                    safe_sender_display = handle_text_encoding(sender_name, "[Unknown sender]")
+                    print(f" Sender: {safe_sender_display}")
+
+                    # Safely process and display the message content
+                    if bubble_text:
+                        display_text = bubble_text[:100] + "..." if len(bubble_text) > 100 else bubble_text
+                        safe_content_display = handle_text_encoding(display_text, "[Unprocessable text content]")
+                        print(f" Content: {safe_content_display}")
+                    else:
+                        print(" Content: [empty]")
+                except Exception as e_display:
+                    print(f"Error displaying message info: {str(e_display)}")
+
                print(f" Bubble Region: {bubble_region}")  # Original region for context
                print(f" Reply Context Activated: {reply_context_activated}")
                try:
+                    # Make sure all text data has gone through safe handling
                    data_to_send = {
-                        'sender': sender_name,
-                        'text': bubble_text,
-                        'bubble_region': bubble_region,  # Send original region for context if needed
+                        'sender': handle_text_encoding(sender_name, "[Unknown sender]"),
+                        'text': handle_text_encoding(bubble_text, "[Unprocessable text content]"),
+                        'bubble_region': bubble_region,
                        'reply_context_activated': reply_context_activated,
-                        'bubble_snapshot': bubble_snapshot,  # Send the snapshot used
+                        'bubble_snapshot': bubble_snapshot,
                        'search_area': search_area
                    }
                    trigger_queue.put(data_to_send)
                    print("Trigger info (with region, reply flag, snapshot, search_area) placed in Queue.")
 
+                    # --- New: update the sender info in the bubble deduplication records ---
+                    # Note: the bubble was already added to the dedup system earlier,
+                    # before the sender name was known; here we try to update the
+                    # sender info after the fact (if the implementation allows it).
+                    if 'bubble_deduplicator' in locals() and bubble_snapshot and sender_name:
+                        bubble_id = bubble_deduplicator.generate_bubble_id(bubble_region_tuple)
+                        if bubble_id in bubble_deduplicator.recent_bubbles:
+                            bubble_deduplicator.recent_bubbles[bubble_id]['sender'] = sender_name
+                            bubble_deduplicator._save_storage()
+                    # --- End of sender info update ---
+
                    # --- CRITICAL: Break loop after successfully processing one trigger ---
                    print("--- Single bubble processing complete. Breaking scan cycle. ---")
                    break  # Exit the 'for target_bubble_info in sorted_bubbles' loop
 
                except Exception as q_err:
-                    print(f"Error putting data in Queue: {q_err}")
-                    # Don't break if queue put fails, maybe try next bubble? Or log and break?
+                    print(f"Error preparing or enqueueing data: {q_err}")
+                    # Fall back to a minimal data set to keep things functional
+                    try:
+                        minimal_data = {
+                            'sender': "[Data processing error]",
+                            'text': handle_text_encoding(bubble_text[:100] if bubble_text else "[Failed to get content]"),  # Apply encoding here too
+                            'bubble_region': bubble_region,
+                            'reply_context_activated': False,  # Sensible default
+                            'bubble_snapshot': bubble_snapshot,  # Keep snapshot if available
+                            'search_area': search_area
+                        }
+                        trigger_queue.put(minimal_data)
+                        print("Minimal fallback data placed in Queue after error.")
+                    except Exception as min_q_err:
+                        print(f"Critical failure: Could not place any data in queue: {min_q_err}")
                    # Let's break here too, as something is wrong.
                    print("Breaking scan cycle due to queue error.")
                    break
 
            # End of keyword found block (if keyword_coords:)
            # End of keyword found block (if result:)
            # End of loop through sorted bubbles (for target_bubble_info...)
 
            # If the loop finished without breaking (i.e., no trigger processed), wait the full interval.