-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GGUF: C++ refactor, backend support, misc fixes #11030
GGUF: C++ refactor, backend support, misc fixes #11030
Conversation
if constexpr (std::is_same<T, uint32_t>::value) { | ||
GGML_ASSERT(val > 0 && (val & (val - 1)) == 0 && GGUF_KEY_GENERAL_ALIGNMENT " must be power of 2"); | ||
} else { | ||
GGML_ABORT(GGUF_KEY_GENERAL_ALIGNMENT " must be type u32"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be a static_assert ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it cannot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad! miss the first if ... I have to be more careful next time.
|
||
struct ggml_context * ctx_data = *params.ctx; | ||
|
||
struct ggml_tensor * data = nullptr; |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to use the struct keyword even if not strictly necessary in C++ code.
The MacOS compilation failure seems to be due to the implicit use of |
Yes, we should remove it now that the SYCL backend does not use it anymore. |
I went ahead and added myself to |
Or create the copy constructor for the struct gguf_tensor_info. |
I just realized that I (unintentionally) changed how arrays of size 1 are treated. With this PR they are converted to scalars, on master they are not. |
ggml/src/gguf.cpp
Outdated
// TODO this returns int but the underlying type is uint64 | ||
int gguf_get_n_kv(const struct gguf_context * ctx) { | ||
return ctx->kv.size(); | ||
} | ||
|
||
// TODO should this return uint64_t? | ||
int gguf_find_key(const struct gguf_context * ctx, const char * key) { | ||
// return -1 if key not found | ||
int keyfound = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think so. In general, we use size_t
for memory sizes and offsets expressed in bytes and int64_t
for counting things like number of elements. The GGUF spec slightly deviates from the convention and uses uint64_t
for both of these, which is not ideal. But at least the interface should match the underlying types.
Also, there are GGUF functions that use int
for indexing KV and tensors that would also need to be updated:
GGML_API uint8_t gguf_get_val_u8 (const struct gguf_context * ctx, int key_id);
GGML_API const char * gguf_get_tensor_name (const struct gguf_context * ctx, int i);
However, this might become a breaking change. Maybe we should do this in a follow-up PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the data type from a signed to an unsigned integer would definitely be a breaking change since the intended way to check whether a key exists is to check whether the returned index is non-negative. It would also be possible to return int64_t
since the maximum number of gguf_kv
/gguf_tensor_info
elements that can be allocated on a 64 bit system is going to be smaller than the maximum value that can be represented with int64_t
.
ggml/src/gguf.cpp
Outdated
// TODO should this be a const pointer? should it exist at all? | ||
void * gguf_get_data(const struct gguf_context * ctx) { | ||
return ctx->data; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can remove this.
An other idea for gguf_kv #include <string>
#include <cstdint>
#include <vector>
#include <memory>
[[noreturn]] void ggml_abort(const char * file, int line, const char * fmt, ...);
#define GGML_ABORT(...) ggml_abort(__FILE__, __LINE__, __VA_ARGS__)
enum gguf_type {
GGUF_TYPE_UINT8 = 0,
// [...]
GGUF_TYPE_STRING = 8,
// [...]
GGUF_TYPE_COUNT, // marks the end of the enum
};
template <typename T>
struct type_to_gguf_type;
template <>
struct type_to_gguf_type<uint8_t> {
static constexpr enum gguf_type value = GGUF_TYPE_UINT8;
};
// [...]
template <>
struct type_to_gguf_type<std::string> {
static constexpr enum gguf_type value = GGUF_TYPE_STRING;
};
class gguf_kv {
gguf_kv(gguf_kv&) = delete;
gguf_kv(gguf_kv&&) = delete;
gguf_kv* operator=(gguf_kv&) = delete;
gguf_kv* operator=(gguf_kv&&) = delete;
public:
gguf_kv(const std::string& key, gguf_type type):
key {key},
type {type}
{};
virtual ~gguf_kv() {};
const std::string & get_key() const {
return key;
}
const enum gguf_type & get_type() const {
return type;
}
virtual size_t get_ne() const = 0;
virtual const uint8_t & get_val_uint8(const size_t i = 0) const = 0;
// [...]
virtual const std::string & get_val_string(const size_t i = 0) const = 0;
protected:
const std::string key;
const gguf_type type;
};
template<typename T, bool VECTOR> struct gguf_kv_type;
template<typename T> struct gguf_kv_type<T, true> {
using type = std::vector<T>;
};
template<typename T> struct gguf_kv_type<T, false> {
using type = T;
};
template<typename T, bool VECTOR = false>
class gguf_kv_t : public gguf_kv {
private:
gguf_kv_t(gguf_kv_t&) = delete;
gguf_kv_t(gguf_kv_t&&) = delete;
gguf_kv_t* operator=(gguf_kv_t&) = delete;
gguf_kv_t* operator=(gguf_kv_t&&) = delete;
public:
using value_type = T;
using type = typename gguf_kv_type<T,VECTOR>::type;
gguf_kv_t(const std::string& key, const type& value):
gguf_kv(key, type_to_gguf_type<value_type>::value),
value(value)
{ }
size_t get_ne() const override {
if constexpr (VECTOR) {
return value.size();
} else {
return 1;
}
}
virtual const uint8_t & get_val_uint8(const size_t i = 0) const {
if constexpr (type_to_gguf_type<T>::value == GGUF_TYPE_UINT8) {
return get_val(i);
}
GGML_ABORT("Not a GGUF_TYPE_UINT8");
}
// [...]
virtual const std::string & get_val_string(const size_t i = 0) const {
if constexpr (type_to_gguf_type<T>::value == GGUF_TYPE_STRING) {
return get_val(i);
}
GGML_ABORT("Not a GGUF_TYPE_STRING");
}
private:
inline const type & get_val(const size_t i = 0) const {
if constexpr (VECTOR) {
return value[i];
} else {
return value;
}
}
protected:
type value;
};
int main() {
std::unique_ptr<gguf_kv> t_string {new gguf_kv_t<std::string, false>("key", "value")};
std::unique_ptr<gguf_kv> t_uint8 {new gguf_kv_t<uint8_t, false>("key", 0)};
std::vector<std::unique_ptr<gguf_kv>> list;
list.push_back(std::move(t_string));
list.push_back(std::move(t_uint8));
} |
I considered making |
ggml/src/gguf.cpp
Outdated
const uint64_t n_kv = src->kv.size(); | ||
for (uint64_t i = 0; i < n_kv; ++i) { | ||
const struct gguf_kv & kv = src->kv[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for (const auto & kv : src->kv) {
[...]
switch (kv.get_type())
[...]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My personal preference it to use loop variables and indexing since I find that to be more convenient in conjunction with GNU Debugger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you what to use indexing (a bad idea for me in that case) you have to use
std::vector<struct gguf_kv>::size_type ...
// or auto when possible
and not a uint64_t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint64_t
is acceptable here because it's the underlay type of n_kv
when writing to/reading from a file. Another option is size_t
, as we usually use it in the llama.cpp code base.
|
Thanks, I wasn't aware. The only reason why I switched from |
or use the |
I don't remember the exact context but I previously (I think it was in 2023) made a PR with portability issues related to files on Windows and when I suggested using |
My experience is from quite a few years ago and things have likely improved. Moreover, we are already using But in any case, if |
I changed the GGUF API as follows:
|
remove ggml_tensor.backend update CODEOWNERS [no ci] remove gguf_get_data from API revise GGUF API data types
1f36d19
to
b456e10
Compare
ggml/src/gguf.cpp
Outdated
return fread(dst.data(), 1, dst.length(), file) == dst.length(); | ||
} | ||
|
||
bool read(void * dst, const size_t size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool read(void * dst, const size_t size) { | |
bool read(void * dst, const size_t size) const { |
This PR splits the GGUF code into a separate file and refactors it as C++, see discussion in #10655 . In addition to the lists in the linked PR:
gguf_write_to_file
fromvoid
tobool
. The return value indicates success. On master a failure to open the output file results in a crash, a failure to write to the file is silently ignored.general.alignment
touint32_t
and powers of 2. On master this key can be set to other types (allowing users to write a file that then causes an error on read) and other values (which don't work correctly withGGML_PAD
). There is now a macroGGUF_KEY_GENERAL_ALIGNMENT
since this key has a special meaning.gguf_get_arr_data
on a string array an error is raised. On master this returns a pointer of typegguf_str
, a type defined inggml.c
. I would consider this a misuse of the API.std::stringstream
is used intest-gguf
which (unliketmpfile
) should work without elevated privileges on Windows.What could still be done: