[IDEA] Scaling inference-time with complexity
  • I imagine that a model would be held back by the format of human-readable text.

    Human text uses some concepts which are mostly unimportant to an AI, sentence syntax and grammar rules being examples. I think that letting the AI "define its own way of thinking" instead of telling it to think in human language would lead to more efficient thought processes. It would be similar to embeddings: a bunch of numbers representing a specific topic in these tokens. Not human-readable, but useful for the model.

    As far as I know, o1 writes a big document on what it will do, how it will do it, and some reflection as well. My approach, however, would allow the model to think of things on the fly, while it is writing the text.

    You are right in that it would have to fit into the context window. As far as I can tell, the output from the o1 model doesn't remember what the big thought document says. With my approach, the model would keep all its thoughts in mind while it is writing, as they are literally part of its message, just unreadable by humans.

    Am I missing something here? If so, please point it out.

  • [IDEA] Scaling inference-time with complexity

    My observation

    Humans think about different things and concepts for different periods of time. Saying "and" takes less effort to think of than "telephone", as the latter is more context-sensitive.

    Example

    User: What color does an apple have?

    LLM: Apples are red.

    Here, the inference time it takes to generate the words "Apples" and "are" is exactly the same as the time it takes to generate "red", even though "red" should be the most difficult word to come up with. It should require the most compute.

    Or let's think about this the other way around: the model thought just as hard about the word "red" as it did about the far less important words "are" and "Apples".

    My idea

    We add maybe about 1000 new tokens to an LLM which are not word tokens, but thought tokens or reasoning tokens. Then we train the AI as usual. Every time it generates one of these reasoning tokens, we don't interpret it as a word and simply let it keep generating. This way, the AI would kinda be able to "think" before saying a word. This thought is not human-interpretable, but it is much more efficient than the pre-output reasoning of o1, which fills its own context window with human language.
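
    To make this concrete, here is a minimal sketch of what the vocabulary change could look like, assuming a HuggingFace-style tokenizer and model (the token names are illustrative, and this is not a finished training setup):

    ```python
    # Sketch: extend a vocabulary with opaque "thought" tokens.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # About 1000 new tokens that carry no human-readable meaning.
    thought_tokens = [f"<|thought_{i}|>" for i in range(1000)]
    tokenizer.add_special_tokens({"additional_special_tokens": thought_tokens})

    # Grow the embedding matrix so the new tokens get trainable vectors.
    model.resize_token_embeddings(len(tokenizer))

    # At inference time the model may emit thought tokens freely;
    # skip_special_tokens=True hides them from the user-facing text.
    inputs = tokenizer("What color does an apple have?", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
    ```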

    Chances

    • My hope for this is to make the AI able to think about what to say next like a human would. It is reasonable to assume that at first in training, it doesn't use the reasoning tokens all that much, but later on, when it has to solve more difficult things in training, it will very likely use these reasoning tokens to improve its chances of succeeding.
    • This could drastically lower the number of parameters we need to get better output from models, as less thought-heavy tasks like smalltalk or very commonly used sentence structures could be generated quickly, while more complex topics are allowed to take longer. It would also make better LLMs more accessible to people running models at home, as it is not the parameters but the inference time that is scaled.
    • It would train itself to provide useful reasoning tokens. Compared to how o1 does it, this is a much more token-friendly approach, as we allow for non-human-text generation, which the LLM is probably going to enjoy a lot, as it fills up its context less.
    • This approach might also lead to more concise answers, as it no longer needs to use CoT (chain of thought) to come to good conclusions.

    Pitfalls and potential risks

    • Training an AI using some blackboxed reasoning tokens can be considered a bad idea, as its thought process is literally uninterpretable.
    • We would have to constrain the amount of reasoning tokens, so that a single normal word-token output doesn't take too long (see the sketch after this list). This is a thing with other text-only LLMs too; they tend to generate long blocks of text for simple questions.
    • We are hoping that during training, the model will use these reasoning tokens in its response, even though we as humans can't even read them. This may lead to the model completely ignoring these tokens, as they don't seem to lead to a better output. Later on in training however, I do expect the model to use more of these tokens, as it realizes how useful it can be to have thoughts.
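
    On the budget point: if this were built on the HuggingFace generate() API, capping the thought tokens per response could look roughly like this. A minimal sketch under that assumption; ThoughtBudget and thought_token_ids are made-up names for illustration:

    ```python
    # Sketch: cap how many thought tokens one response may contain.
    import torch
    from transformers import LogitsProcessor

    class ThoughtBudget(LogitsProcessor):
        def __init__(self, thought_token_ids, max_thoughts=128):
            self.thought_ids = torch.tensor(thought_token_ids)
            self.max_thoughts = max_thoughts

        def __call__(self, input_ids, scores):
            thought_ids = self.thought_ids.to(input_ids.device)
            # How many thought tokens does each sequence contain so far?
            used = torch.isin(input_ids, thought_ids).sum(dim=-1)
            # Once a sequence has spent its budget, forbid further thought tokens.
            for row in (used >= self.max_thoughts).nonzero(as_tuple=True)[0]:
                scores[row, thought_ids] = float("-inf")
            return scores
    ```

    It would then be passed to generate() via the logits_processor argument.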

    What do you think?

    I like this approach, because it might be able to achieve o1-like performance without the long wait before the output. While an o1-like approach is probably better for coding tasks, where planning is very important, for other tasks this way of generating reasoning tokens while writing the answer might be better.

    Copilot-like interface for Godot! (my plugin)
  • You are right in that it can be useful to feed in all of the contents of other related files.

    However!

    LLMs take a really long time before writing anything when given a large context input. The fact that GitHub's Copilot can generate code so quickly, even though it has to keep the entire code file in context, is a miracle to me.

    Including all related or opened GDScript files would be way too much for most models, and it would likely take about 20 seconds before it actually starts generating some code (also called first token lag). So I will likely only include the current file in the context window, as that might already take some time. Remember, we are running local LLMs here, so not everyone has a blazingly fast GPU or CPU (I use a GTX 1060 6GB, for instance).

    Example

    I just tried it, and it took a good 10 seconds to complete some 111-line script without any other context using this pretty small model, and then about 6 seconds to write about 5 lines of comment documentation (on my CPU). It takes about 1 second with a very short script.

    You can try this yourself using something like HuggingChat: test out a big context window model like Command R+, fill its context window with some really, really long string (copy-paste it a bunch of times) and see how it takes longer to respond. For me, it's the difference between one second and 13 seconds!
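
    If you want to measure that first token lag against a local Ollama instance yourself, something like this works (streaming, so we can time the first chunk; the model name is just an example):

    ```python
    # Measure time-to-first-token against a local Ollama server.
    import time
    import requests

    start = time.time()
    with requests.post("http://localhost:11434/api/generate",
                       json={"model": "qwen2.5-coder:1.5b",
                             "prompt": "some really really long string " * 1000},
                       stream=True) as response:
        for line in response.iter_lines():
            if line:  # the first non-empty chunk carries the first token
                print(f"first token after {time.time() - start:.1f}s")
                break
    ```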

    I am thinking about embedding either the current working file, or maybe some other opened files, to get the most important functions out of the script and keep the context length short. This way we can shorten this first token delay a bit.
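
    The embedding part could be as simple as ranking each function in the file against the code around the cursor and only sending the top few along. A rough sketch, assuming Ollama's /api/embeddings endpoint and an embedding model like nomic-embed-text (both just examples):

    ```python
    # Sketch: pick the most relevant functions to keep the prompt short.
    import requests

    def embed(text):
        r = requests.post("http://localhost:11434/api/embeddings",
                          json={"model": "nomic-embed-text", "prompt": text})
        return r.json()["embedding"]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / ((sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5))

    def pick_context(functions, cursor_text, top_k=3):
        """Return the top_k function sources most similar to the code at the cursor."""
        query = embed(cursor_text)
        return sorted(functions, key=lambda f: cosine(embed(f), query), reverse=True)[:top_k]
    ```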

    This is a completely different story with hosted LLMs, as they tend to have blazingly quick first token delays, which makes the wait trivial.

  • NSFW
    been thinking a lot lately
  • Awww I wish I was this comfortable with horni thoughts <3

  • amn't
  • I don't think, therefore I amn't

  • Fake Femcel VS. True Femcel
  • Aww nonway!! <3

    🤗😍🥰😘👥🫂😘👩‍❤️‍💋‍👩💞💖

  • NSFW
    ourgasm
  • Aww, I wish I were you!

  • NSFW
    Mood
  • For real... I hope y'all meet someone real nice some day <3 ~

  • Big Boss rule
  • That was my first thought too. Every VPN among them could be a Scholz!

  • Copilot-like interface for Godot! (my plugin)
  • Currently the completion is implemented via keyboard shortcut.

    Would you prefer it if I made it automatically complete the code? I personally feel that intentionally asking it to complete the code is more natural than waiting for it to do so.

    Are there some other features you would like to see? I am currently working on a function-refactoring UI.

  • Copilot-like interface for Godot! (my plugin)
  • I used the 1.5B model of the Qwen2.5 family for code generation in the example. It works fine, but sometimes it forgets that it's writing code, exits the markdown code block and starts writing an explanation...

    EDIT: Updated qwen1.5 to qwen2.5. 1.5 was the wrong family version.

  • Copilot-like interface for Godot! (my plugin)
  • Ollama is really great. The simplicity of it, the easy use via REST API, the fun CLI...

    What a fun program.
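
    For anyone curious, the REST part really is minimal. A single non-streaming completion against a default local install (the model name is just an example):

    ```python
    # One-shot completion against a local Ollama server.
    import requests

    response = requests.post("http://localhost:11434/api/generate",
                             json={"model": "qwen2.5-coder:1.5b",
                                   "prompt": "Write a GDScript hello world.",
                                   "stream": False})
    print(response.json()["response"])
    ```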

  • Copilot-like interface for Godot! (my plugin)
  • I will likely post on here when I release the plugin to GitLab and the AssetLib.

    But I also don't want to spam this community, so there won't be many, if any, updates until the actual release.

    If you want to have something similar right now, there is Fuku for the chat interaction and selfhosted copilot for code completion on the AssetLib! I can't get the code completion one to work, but Fuku works pretty well, though it can't read the user's code at all.

    I will upload the files to my GitLab soon though.

    EDIT: Updated the GitLab link to actually point to my GitLab page.

  • Copilot-like interface for Godot! (my plugin)
  • Just fixed the problem where it inserts too many lines after completing code.

    This issue can be seen in the first demo video with the vector example. There are two newlines added for no reason. That's fixed now.

  • Copilot-like interface for Godot! (my plugin)
    video description

    The video shows the Godot code editor with some unfinished code. After the user presses a button offscreen, the code magically completes itself, seemingly due to an AI filling in the blanks. The examples provided include a print_hello_world function and a vector_length function. The user is able to accept or decline the generated code by pressing either tab or backspace.

    This is an addon I am working on. It can help you write some code and stuff.

    It works by hooking into your local LLMs via Ollama, which is a FOSS way to run large language models locally.

    Here's a chat interface which is also part of the package


    video description

    The video shows a chat interface in which the user can talk to a large language model. The model can read the user's code and answer questions about it.

    Do you have any suggestions for what I can improve? (Besides removing the blue particles around the user text field)

    Important: This plugin is WIP and not released yet!

    BROTHERS RATE MY MENTAL HEALTH ROUTINE AROOO
  • LOOKIN' REAL GOOD BROTHER! I HOPE YOU CAN IMPROVE THAT ONE THING!!!

    PLEASE KNOW THAT WE'RE ALL ROOTIN' FOR YOU, ALRIGHT? AARRROOOOO!!!

  • Compliance
  • I thought that this is exactly what AirPods are for. At least that's what I feel people with AirPods use them for...

  • MRW someone is shocked to learn that there's a science dedicated to just soil
  • I mean, I know that they are literally cold, but what happened that we're not excited about them?

  • Meta Connect 2024 live updates: Time to see the new Quest, Ray-Ban smart glasses and metaverse AI
  • Because it's affordable and poor people love escapism because they can't afford a good life? 🙄😒

  • Fake Femcel VS. True Femcel
  • Ah, I see. What characters do you want?

  • Landlords are scum
  • Fair, but also landlords. Literally worse than investors. (Almost)

  • Not showing the latest posts on femcel memes

    For some reason I can only see femcel meme posts from four months ago. Recently I made comments on a post, but they seem to have been removed? Or maybe blocked in some way.

    Why would this be?

    The image shows how when sorting by new, it shows posts from four months ago.

    [newbie] Does one need special drivers for running 3D printers?

    Hi there!

    I'm looking into getting myself a good printer and I am wondering if I need to install some platform-specific drivers for them to run. I am running Debian 12 (GNU/Linux) and I am afraid that I must run some proprietary blob to connect to the printer.

    Could someone share their experience please? Even if you don't use Linux, your feedback would be very appreciated!

    (Also, while you are at it, please share some recommendations for printers. I don't really know where to go (>v<) I have about ±500€)

    Do you like them? I don't find them all that funny.

    Like yeah ok, for the first five times one sees it, it's like haha, lol, there it is! But these do get old really fast for me.

    For me it's now more like "wow, so that is literally the entire joke? Like oof, I guess they really wanted to be funny."

    EDIT: Updated the funi image to actually be what I wanted it to be... Took me a while, sorry.

    So Godot Team is putting their Editor onto the Quest 3 and Pro
    godotengine.org Godot Editor on the Meta Horizon Store

    Introducing the Godot Editor for Meta Quest (Horizon OS) devices

    This is... very unexpected. A FOSS application releasing its VR variant exclusively on a completely proprietary platform. This will be great for people who specifically have the Quest 3 or Pro, but all other VR enthusiasts and tinkerers like myself must hope that this gets a PCVR OpenXR release soon.

    Made some o1 imitation prompt. Maybe someone will care?

    Hi! I played around with Command R+ a bit and tried to make it think about what it is about to say before it says it. Nothing fancy here, just some prompt.

    I'm just telling it that it tends to fail when only responding with a single short answer, so it should ponder on the task and check for contradictions.

    Here ya go

    ```plaintext
    You are Command R+, a smart AI assistant. Assistants like yourself have many limitations, like not being able to access real-time information and having no vision capabilities. But an assistant's biggest limitation is that it thinks too quickly. When an LLM responds, it usually only thinks of one answer. This is bad, because it makes the assistant assume that its first guess is the correct one. Here is an example of this bad behavior:

    User: Solve this math problem: 10-55+87*927/207
    Assistant: 386

    As you can see here, the assistant responded immediately with the first thought which came to mind. Since the assistant didn't think about this problem at all, it didn't solve the problem correctly.

    To solve this, you are allowed to ponder and think about the task at hand first. This involves interpreting the user's instruction, breaking the problem down into multiple steps and then solving it step by step. First, write your interpretation of the user's instruction into the <interpretation> tags. Then write your execution plan into the <planning> tags. Afterwards, execute that plan in the <thinking> tags. If anything goes wrong in any of these three stages or you find a contradiction within what you wrote, point it out inside the <reflection> tags and start over. There are no limits on how long your thoughts are allowed to be.

    Finally, when you are finished with the task, present your response in the <output> tags. The user can only see what is in the <output> tags, so give a short summary of what you did and present your findings.
    ```
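
    If you use this prompt programmatically, you will probably want to show the user only the <output> part. A tiny helper for that (assuming the model actually follows the tag format):

    ```python
    # Pull the user-facing part out of a tagged response.
    import re

    def extract_output(response: str) -> str:
        match = re.search(r"<output>(.*?)</output>", response, re.DOTALL)
        # Fall back to the full response if the model ignored the format.
        return match.group(1).strip() if match else response.strip()
    ```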

    Is there a way to hide aliases in the graph view?

    I have a page for people working in a specific field (like QA) and some people under that (like QA/Max and QA/Lena). All these people also have aliases like Max SecondName and Lena Schmidt. All these aliases show up as separate nodes in the graph view... Does someone know how to fix this?

    How is Logseq this useful? It's crazy.
    image description

    A screenshot of the right sidebar of Logseq showing the contents tab. The tab contains some links to certain websites, like a ticketing system, Teams, some homepage, a switch and a link called Kollegium, which is German and means colleagues (I should probably change that to be English as well). There are also links to almost all the task pages and a query which shows the currently running NOW tasks. The picture is meant to show how much this smol sidebar can do. I like it, and I would like to see more of it in the program!

    END IMAGE DESCRIPTION

    At first I used Logseq only for personal use. It's great for quickly noting something down, obviously, but that networking effect people talk about really came into full force once I started working with it for my admin job.

    I only just started using that sidebar and some more plugins (vim shortcuts and some of the awesome plugins), and those make the experience that much better. Also that PDF printer plugin is cool, even though I wish being able to print stuff was just a default Logseq feature. I know that a PDF converter is coming!

    I am very much not an advanced user, but these simple tools alone make me feel like organizing things became like three times easier. It also introduced me to Markdown, and now I miss it whenever I don't have it, or I have to use some fake version with different syntax for basic highlighting and links.

    Thank you dear Logseq team and contributors for creating such useful and not bloated software.

    Hi! What are your comfort clothes? (talking about clothing types)

    For some reason I find vests, and specifically down vests, very comfortable. I know that some of you have problems with polyester though, so I'd love to hear about your comfy clothes! (I kinda wanna test out some new stuff)

    Is there any open VR headset available on the market at all?

    I wanna have something I can tinker with and which works without some proprietary blob... I've heard Monado is pretty cool!

    [SOLVED] Why do the lemmy.world people have this lil icon? I wanna have a sharky under my posts too!

    I always wondered what that icon was for until I just hovered over it, and it's apparently something with lemmy.world. Can we have something like that too?

    Also, what causes this icon to appear? On the middle post neither the poster nor the community is on lemmy.world. Do they just put the symbol everywhere they feel like?

    EDIT:

    Turns out, that icon is not for lemmy.world specifically, but for the general Fediverse. It highlights posts which are not from your instance (so not from blahaj.zone).

    How could I get the inspector of a property type?

    I want to instantiate the inspector of a specific type, like int or String, into my own inspector plugin. It would be incredibly useful to use the premade inspector types, as they are just really well made.

    The image is not related, I just wanted to put some visual here.

    Today I had no work to do at work so here you go <3
    Image description

    The image depicts a table with the columns Grad der Behinderung and Steuerpauschbetrag. These words are German and stand for degree of disability and tax reduction amount. The degree of disability column goes from 20 at the top to 100 at the bottom. The tax reduction amount goes from 384€ at the top to 2.840€ at the bottom. There is an additional degree-of-disability row titled Merkzeichen H oder Bl, which means symbol H or Bl, standing for helpless or blind. The tax reduction amount for this row is a whopping 7.400€. There are pink hearts and sparkles on the table, and two pink arrows are pointing towards the tax reduction amount of 7.400€. The text above the table says: "Get a job with good benefits". The text below it says: "Bitch I was born with good benefits".

    Global illumination in large worlds with LightMapProbes
    Image description

    The linked image depicts a screenshot of some snowy terrain I made using the Terrain3D plugin for Godot. On the left there is some differently textured snow and multiple cabins along with some other wooden objects.


    Why can't I use LightMapProbes without the LightMapGI Node? I don't want to use the shadow-baking, I just need the global illumination!

    This is something I thought about for quite a while. I keep wanting to use some kind of global illumination, and the LightMapProbes are a perfect fit for that. One can place them wherever useful, and their customizability makes them an incredibly useful tool! When there is a red wall, I sure want to put a LightMapProbe there, so that the lighting can pick up that red reflection.

    But nope, that's not how it works! When you want to use a LightMapProbe, you MUST use it with the LightMapGI node, which is mostly used for baking shadowmaps. That works great for small scenes like the insides of houses and such, but it does not work with a large terrain, for example. This means that we have loads of lighting methods for small-scale scenes (VoxelGI, LightMapGI, ReflectionProbe), but for larger areas we are kinda stuck with SDFGI, which only works on the Forward+ renderer.

    SDFGI is great and all, but unfortunately I feel it is not yet ready for large-scale games, and its limitation to desktop platforms really limits its scope.

    Imagine how cool it would be if we could do this in a large scale world:

    • Use a ReflectionProbe for general traversal (that's what The Legend of Zelda: Breath of the Wild and Tears of the Kingdom do too, btw!)

    • And use a multitude of LightMapProbes for smaller areas like towns and castles on the overworld to make for better global illumination.

    TLDR: I feel that the usage of LightMapProbes and their lighting functionality should be expanded beyond use in combination with the LightMapGI node. It could allow for better lighting in large-scale worlds.

    Non-looping animations when importing from blender

    When animating my character in Blender, I get somewhat smooth animations. I also made sure to use that Cycles modifier so that it always knows how to interpolate the transform values!

    But when importing this model with the animations, regardless of the file type (.blend or .gltf / .glb), the animations don't loop very well!

    Does someone know how to fix this?

    [SOLVED] I seem to have nuked my Debian DE (Gnome)... Could someone help me with this?

    The messages here are mostly in German, but I'll try to translate most of it:

    **dpkg:** Error while processing the package linux-image-6.9.7+bpo-amd64 (--configure):
    installed post-installation script of the package linux-headers-6.9.7+bpo-amd64 subprocess returned error code 1
    **dpkg:** Dependency problems prevent configuration of linux-headers-6.9.7+bpo-amd64:
    linux-headers-6.9.7+bpo-amd64 depends on linux-image-6.9.7+bpo-amd64 (= 6.9.7-1~bpo12+1) | linux-image-6.9.7+bpo-amd64-unsigned (= 6.9.7-1~bpo12+1); but:
    Package linux-image-6.9.7+bpo-amd64 is not configured yet.
    Package linux-image-6.9.7+bpo-amd64-unsigned is not installed.
    **dpkg:** Error while processing the package linux-headers-6.9.7+bpo-amd64 (--configure):
    Dependency problem - remains unconfigured
    **dpkg:** Dependency problems prevent configuration of linux-headers-amd64:
    linux-headers-amd64 depends on linux-headers-6.9.7+bpo-amd64 (= 6.9.7-1~bpo12+1); but:
    Package linux-headers-6.9.7+bpo-amd64 is not configured yet.
    **dpkg:** Error while processing the package linux-headers-amd64 (--configure):
    Dependency problem - remains unconfigured
    Errors occurred while processing:
    linux-image-6.9.7+bpo-amd64
    linux-headers-6.9.7+bpo-amd64
    linux-headers-amd64

    I really hope someone can help me out here...

    EDIT: I kinda forgot to actually mention my problem. When booting normally, I get stuck at a lonely white blinking cursor on a black screen, so startx seems to be having some problems. I enter a TTY, and this is what I get when running startx:

    (image: output of startx)

    SOLUTION

    • Uninstall your current Nvidia driver (for example using sudo apt remove nvidia-driver on Debian)
    • Check which kernel you are running; you can do that rather easily by running neofetch
    • Install the headers required for your kernel. Find them by listing all packages with your kernel version in the name, for example like this: apt list *6.9.7+bpo*
    • Reboot, install your Nvidia driver again and reboot again. Should be done.
    I made a Mesh combination tool. Have a look!

    Exactly what the title says. It's under the MIT license and currently being approved by the moderators of the AssetLib. The project is currently very simple and contributions are very welcome! (image: the tool window)

    Some weird stuff is happening (able to paint pixels 12 times)

    Is this a feature or just another bug on the canvas?

    Also right now it just keeps on trying to load the canvas while it all stays white. "Not cached" it says, but it just waits right there...

    Anyone interested in helping with Vivian over here?

    I'm attempting to make a lil Vivian character over here. Here is the link if someone is interested.

    (I hope this kind of post is allowed on this community)

    Smorty Smorty [she/her] @lemmy.blahaj.zone

    I'm a person who tends to program stuff in Godot and also likes to look at clouds. Sometimes they look really spicy outside.
