For educational purposes only.
Brave Browser has introduced a chatbot that runs on Mixtral 8x7B, one of the best open-source LLMs as of writing. The idea is that Brave sets up a reverse proxy that hides users' source IPs from the model hosts and also enables billing. No registration is required to access the free quota.
Accessing this API requires a bit of reverse engineering effort.
V1 API
The V1 API under ai-chat.bsg.brave.com/v1/complete has basically no protection: a single static x-brave-key header is the only safeguard, and it is trivial to acquire. A quick SSL decryption with Charles or Burp Suite reveals the value.
Newer models remain accessible via V1 API as of writing.
V2 API
Brave has introduced the V2 API, ai-chat.bsg.brave.com/v2/complete, for the Mixtral 8x7B model, along with HTTP Message Signatures for authentication.
For this API:
- An x-brave-key header is still required, which is NOT part of the HTTP Message Signatures RFC;
- The signing algorithm is HMAC with SHA-256;
- Signing is done via a pre-shared key (PSK);
- Multiple key-id values are active at the same time, in the format {os}-{chrome-major-ver}-{channel}, e.g., linux-121-nightly, which is used to differentiate between PSKs;
- No expiry or CSRF token is needed, which makes replay possible;
- The only field signed is digest - more on that later.
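To make the key-id format concrete, here is a minimal sketch that builds and parses such identifiers. The class name KeyId and its field names are illustrative assumptions, not names taken from Brave's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyId:
    """Hypothetical helper for the {os}-{chrome-major-ver}-{channel} format."""
    os_name: str
    chrome_major: int
    channel: str

    def __str__(self) -> str:
        return f"{self.os_name}-{self.chrome_major}-{self.channel}"

    @classmethod
    def parse(cls, value: str) -> "KeyId":
        # Split into at most three parts so the channel keeps any extra dashes.
        os_name, major, channel = value.split("-", 2)
        return cls(os_name, int(major), channel)


key_id = KeyId("linux", 121, "nightly")
print(key_id)                                # linux-121-nightly
print(KeyId.parse("linux-121-nightly") == key_id)  # True
```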
HTTP Message Signatures
HTTP Message Signatures is probably designed for message integrity verification while still allowing HTTP headers to be modified by an SNI proxy. Traditionally, plain HTTP (without the S) is considered unsafe, while SSL/TLS is considered safe against decryption or modification. This property is critical for Internet traffic but poses challenges for controlled networks, such as company or school networks, where traffic monitoring is expected for data loss prevention. In those cases a private root certificate is usually installed on devices. This enables monitoring but also enables modification, which is undesirable for service owners. HTTP Message Signatures can mitigate this issue by adding another layer of signature over a selected scope, ensuring the integrity of those fields while leaving the other, unprotected parts open to modification.
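The selective-protection idea can be illustrated with a small, self-contained sketch. It is not Brave's implementation: the key, the field values, and the helper names sign and verify are all made up for the demonstration. It only shows that a change to a covered field breaks verification, while a change to an uncovered field does not.

```python
import hashlib
import hmac

DEMO_KEY = b"demo-key-not-a-real-secret"  # assumption: illustrative key only


def sign(fields: dict, covered: list, key: bytes) -> str:
    """HMAC-SHA-256 over the covered fields only, in the listed order."""
    signing_string = "\n".join(f"{name}: {fields[name]}" for name in covered)
    return hmac.new(key, signing_string.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(fields: dict, covered: list, key: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(fields, covered, key), signature)


request = {"digest": "SHA-256=abc123", "user-agent": "ExampleBrowser/1.0"}
covered = ["digest"]
signature = sign(request, covered, DEMO_KEY)

# A middlebox rewrites an unprotected header: the signature still verifies.
rewritten = {**request, "user-agent": "Proxy/2.0"}
print(verify(rewritten, covered, DEMO_KEY, signature))  # True

# Changing a covered field breaks verification.
tampered = {**request, "digest": "SHA-256=evil"}
print(verify(tampered, covered, DEMO_KEY, signature))  # False
```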
Implementation
Brave decides to protect only the message body against tampering. The exact steps are:
- Create the message body for the HTTP POST. Note that it is also possible to sign other HTTP methods;
- (Not really used here) Add protection for other headers, the target host, or the HTTP verb, and combine them with the HTTP body into the message to be signed;
- Calculate the SHA-256 hash of the message and encode it with Base64: base64.b64encode(hashlib.sha256(body.encode('utf-8')).digest());
- Arrange the output fields in exactly the same sequence as the input, since a wrong order results in a different hash. In this case the output is digest: SHA-256={Base64-encoded body hash};
- Use the pre-shared key to sign this output: base64.b64encode(hmac.new('{Pre-Shared-Key}'.encode('utf-8'), "digest: SHA-256={Base64-encoded body hash}".encode('utf-8'), hashlib.sha256).digest());
- Combine everything into the headers: { 'Host': 'ai-chat.bsg.brave.com', 'pragma': 'no-cache', 'cache-control': 'no-cache', 'accept': 'text/event-stream', 'authorization': 'Signature keyId="{os}-{chrome-major-ver}-{channel}",algorithm="hs2019",headers="digest",signature="{Signature of that header}"', 'digest': 'SHA-256={Base64-encoded body hash}', 'x-brave-key': '{V1 key}', 'content-type': 'application/json', 'sec-fetch-site': 'none', 'sec-fetch-mode': 'no-cors', 'sec-fetch-dest': 'empty', 'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/{Major Chromium Version}.0.0.0 Safari/537.36', 'accept-language': 'en-US,en' }
- Send to https://ai-chat.bsg.brave.com/v2/complete and stream the response.
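The steps above can be sketched as three small functions. This is a minimal sketch under stated assumptions: the function names body_digest, sign_digest and build_headers are hypothetical, the key and x-brave-key values are placeholders, and only a subset of the headers is shown. No request is sent.

```python
import base64
import hashlib
import hmac


def body_digest(body: str) -> str:
    """Return the digest header value: SHA-256 of the body, Base64-encoded."""
    raw = hashlib.sha256(body.encode("utf-8")).digest()
    return "SHA-256=" + base64.b64encode(raw).decode("ascii")


def sign_digest(digest_value: str, psk: str) -> str:
    """HMAC-SHA-256 over the single signed line 'digest: <value>'."""
    signing_string = f"digest: {digest_value}"
    mac = hmac.new(psk.encode("utf-8"), signing_string.encode("utf-8"), hashlib.sha256)
    return base64.b64encode(mac.digest()).decode("ascii")


def build_headers(body: str, psk: str, key_id: str, brave_key: str) -> dict:
    digest_value = body_digest(body)
    signature = sign_digest(digest_value, psk)
    return {
        "accept": "text/event-stream",
        "authorization": (
            f'Signature keyId="{key_id}",algorithm="hs2019",'
            f'headers="digest",signature="{signature}"'
        ),
        "digest": digest_value,
        "x-brave-key": brave_key,
        "content-type": "application/json",
    }


# Placeholder inputs: none of these values are real credentials.
headers = build_headers('{"prompt":"hi"}', "0" * 64, "linux-121-nightly", "PLACEHOLDER")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Because the signature covers only the digest line, the body string that is hashed must be byte-for-byte identical to the body that is eventually sent.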
Reverse Engineering the Pre-Shared Key
You will need a tool that can extract all the strings in the binary, like IDA, Hopper or Binary Ninja. Charles or Burp Suite is also required to gather ground truth.
- Start with an SSL proxy and decrypt traffic to the domain to gather a ground truth: a body with its respective digest and signature.
- Try to reproduce digest with the given body: you may need to tweak the body for invisible characters. A working version looks like:
  body = '{"max_tokens_to_sample":600,"model":"mixtral-8x7b-instruct","prompt":"\\u003Cs>[INST] \\u003C\\u003CSYS>>\\nThe current time and date is Monday, January 30, 2024 at 0:00:00\u202fPM\\n\\nYour name is Leo, a helpful, respectful and honest AI assistant created by the company Brave. You will be replying to a user of the Brave browser. Always respond in a neutral tone. Be polite and courteous. Answer concisely in no more than 50-80 words.\\n\\nPlease ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don\'t know the answer to a question, please don\'t share false information.\\n\\nUse unicode symbols for formatting where appropriate. Use backticks (`) to wrap inline coding-related words and triple backticks along with language keyword (```language```) to wrap blocks of code or data.\\n\\u003C\\u003C/SYS>>\\n\\nhi [/INST] ","stop_sequences":["\\u003C/response>","\\u003C/s>"],"stream":true,"temperature":0.2,"top_k":-1,"top_p":0.999}'
- From the RFC we know that hs2019 requires a minimum of 32 bytes of input: in this case, a 64-char hex number stored as a string. Export all strings of this length and manually review them to pick out potential candidates.
- Finally, attempt to reproduce signature with each candidate PSK.
Hint: The key may be located close to chat-related strings.
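The verification loop described in the list can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the key, body, digest and signature are generated inside the script rather than captured from real traffic. It first checks that the digest can be reproduced from the body (which fails if an invisible character such as \u202f is lost), then tests each 64-char hex candidate against the signature.

```python
import base64
import hashlib
import hmac
import re

HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def compute_digest(body: str) -> str:
    raw = hashlib.sha256(body.encode("utf-8")).digest()
    return "SHA-256=" + base64.b64encode(raw).decode("ascii")


def compute_signature(digest_value: str, key: str) -> str:
    message = f"digest: {digest_value}".encode("utf-8")
    mac = hmac.new(key.encode("utf-8"), message, hashlib.sha256)
    return base64.b64encode(mac.digest()).decode("ascii")


def find_key(strings, body, captured_digest, captured_signature):
    """Return the first candidate string that reproduces the signature, else None."""
    if compute_digest(body) != captured_digest:
        raise ValueError("digest mismatch: check the body for invisible characters")
    for candidate in strings:
        if not HEX64.fullmatch(candidate):
            continue  # only 64-char hex strings are plausible keys
        if hmac.compare_digest(compute_signature(captured_digest, candidate),
                               captured_signature):
            return candidate
    return None


# Made-up ground truth for the demo; a real run would use captured values.
demo_key = "ab" * 32
demo_body = '{"prompt":"12:00:00\u202fPM"}'
demo_digest = compute_digest(demo_body)
demo_signature = compute_signature(demo_digest, demo_key)

strings = ["chat", "cd" * 32, demo_key, "not-hex" * 10]
print(find_key(strings, demo_body, demo_digest, demo_signature) == demo_key)  # True
```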