Questions tagged [accelerate]

Make large-scale mathematical computations and image calculations, optimized for high performance.

74 questions
1
vote
2 answers

Keep SQL query general and make it fast

After finding the right SQL query for my purposes, I realized that my query is slow. WITH temp_table (t_col_1, t_col_2, t_col_3) AS ( SELECT col_1 AS t_col_1, col_2 AS t_col_2, col_3 AS t_col_3 FROM actual_table WHERE ID = 100 AND PID =…
grima
  • 75
  • 5
1
vote
1 answer

Trouble installing accelerate-cuda

I have been trying to run `cabal install accelerate-cuda -fdebug` to no avail. At first I had some issues, I think with my version of CUDA, so I upgraded GHC to version 8.0.1 and cabal to version 1.22.5.0. I was able to run cabal install accelerate…
vivace
  • 59
  • 9
1
vote
1 answer

`scipy.optimize.root` faster root finding

I use scipy.optimize.root with the hybr method (best one?) to find the root of a numeric function. I print the residual at each iteration: delta d 117.960112417 delta d 117.960112417 delta d 117.960112417 delta d 117.960048733 delta d…
Covich
  • 2,544
  • 4
  • 26
  • 37
0
votes
0 answers

Loading a HF Model on Multiple GPUs and Running Inference on Those GPUs (Not Training or Fine-tuning)

Is there any way to load a Hugging Face model across multiple GPUs and use those GPUs for inference as well? For example, this model can be loaded on a single GPU (default cuda:0) and run for inference as below: from transformers import…
0
votes
0 answers

How to max out GPU RAM usage while fine-tuning Hugging Face LLMs? Error with `per_device_train_batch_size` Trainer argument

I have an A100 (Colab Pro) with 40GB GPU memory and want to fine-tune an LLM utilizing the GPU's full capacity. When I increase the per_device_train_batch_size argument in Trainer's TrainingArguments to anything other than 1, I receive an…
0
votes
0 answers

A fast way to extract channel data from a CVPixelBuffer in ARGB format

I have a CVPixelBuffer in kCVPixelFormatType_32ARGB format. I need to extract each channel and store it in a separate CVPixelBuffer. I imagine the usage like this: let channels: [CVPixelBuffer] = pixBuffer.split() So this, for the time being,…
0
votes
1 answer

How to accelerate data access in PieCloudDB Database?

Is there any way for PieCloudDB Database to accelerate data access? Does it have a data cache? If so, how do I enable it in PieCloudDB?
0
votes
0 answers

Why is Accelerate faster than MetalKit in my image processing function?

I have a function that I've implemented three ways: naively, using Accelerate, and using MetalKit, with run times of 18 seconds, 9 seconds, and 14 seconds respectively. Since the function involves modifying pixel values, I assumed MetalKit would perform best.…
cyril
  • 3,020
  • 6
  • 36
  • 61
0
votes
0 answers

Error in clip_grad_norm_ for bf16 via PEFT

I am using PEFT code to fine-tune a model, and I use accelerate with bf16 to reduce memory usage. When I call accelerate.clip_grad_norm_(model.parameters(), max_norm=1), I get ValueError: Requires uniform dtype across all gradients but…
Afshin Oroojlooy
  • 1,326
  • 3
  • 21
  • 43
0
votes
0 answers

How to visualize a live soundwave from a radio stream in Swift?

I am working on a SwiftUI iOS app that plays a live radio stream using AVPlayer. I would like to visualize the soundwave of the live radio stream and have it pulsate to the rhythm of music played in the stream. I have already implemented the audio…
TD540
  • 1,728
  • 3
  • 14
  • 26
0
votes
0 answers

How can I create a spectrogram from an audio file?

I have tried to create a spectrogram using this Apple tutorial, but it uses live audio input from the microphone. I want to create one from an existing file. I have tried to convert Apple's example from live input to existing files with no luck, so I…
Trevor
  • 580
  • 5
  • 16
0
votes
0 answers

How do I remove noise from an audio signal? And what threshold/threshold range should I use?

I have loaded an audio file and have created an input and output buffer. But when I follow Apple's post I get an output signal that has distorted sounds. private func extractSignal(input: AVAudioPCMBuffer, output: AVAudioPCMBuffer) { let count…
Scott McKenzie
  • 16,052
  • 8
  • 45
  • 70
0
votes
0 answers

How to specify GPU device numbers when using accelerate's notebook_launcher?

When using Accelerate's notebook_launcher to kick off a training job spanning multiple GPUs, is there a way to specify which GPUs to use (i.e. CUDA_VISIBLE_DEVICES="4,5,6,7"), instead of starting with the default cuda:0? from accelerate…
0
votes
1 answer

Swift performance - efficient calculation of boolean logical operations

I need to work with large Double and Boolean arrays / vectors and apply simple operations to them, e.g. addition, subtraction, multiplication, less-than, greater-than, etc. on the Doubles and AND, OR, NOT, etc. on the Bool ones. While I have found vDSP…
Jay
  • 59
  • 6
0
votes
2 answers

iOS Accelerate: Put luma and chroma buffers in a single CVPixelBuffer

I am converting camera output 420YpCbCr8BiPlanarFullRange to ARGB8888 in order to perform some image processing. I need to convert the outcome back to 420YpCbCr8BiPlanarFullRange to stream it with WebRTC: func convertTo420Yp8(source: inout…